Paper Digest: Recent Papers on AI for Music
The Paper Digest Team extracted all recent AI-for-Music papers on our radar and generated a highlight sentence for each. The results are sorted by relevance and date. In addition to this static page, we also provide a real-time version of this article, which has broader coverage and is updated continuously to include the most recent work on this topic.
This curated list is created by the Paper Digest Team. Paper Digest is an AI-powered research platform that delivers personalized, comprehensive daily digests of the latest research in your field. It also helps you read and write articles, get answers, conduct literature reviews, and generate research reports.
Experience the full potential of our services today!
TABLE 1: Paper Digest: Recent Papers on AI for Music
| # | Paper | Author(s) | Source | Date |
|---|---|---|---|---|
| 1 | Emotional Based Music Recommendation System for Mental Wellness. Highlight: This project presents an Emotion-Based Music Recommendation System designed to enhance mental wellness through intelligent and personalized music therapy. | Aayush Verdhan | International Journal for Research in Applied Science and … | 2026-01-15 |
| 2 | HeartMuLa: A Family of Open Sourced Music Foundation Models. Highlight: We present a family of open-source Music Foundation Models designed to advance large-scale music understanding and generation across diverse tasks and modalities. | Dongchao Yang et al. | arxiv-cs.SD | 2026-01-15 |
| 3 | FusID: Modality-Fused Semantic IDs for Generative Music Recommendation. Highlight: However, existing approaches that tokenize each modality independently face two critical limitations: (1) redundancy across modalities that reduces efficiency, and (2) failure to capture inter-modal interactions that limits item representation. We introduce FusID, a modality-fused semantic ID framework that addresses these limitations through three key components: (i) multimodal fusion that learns unified representations by jointly encoding information across modalities, (ii) representation learning that brings frequently co-occurring item embeddings closer while maintaining distinctiveness and preventing feature redundancy, and (iii) product quantization that converts the fused continuous embeddings into multiple discrete tokens to mitigate ID conflict. | Haven Kim; Yupeng Hou; Julian McAuley | arxiv-cs.IR | 2026-01-13 |
| 4 | Speech and Music Source Separation for Cochlear Implant Users: Front-end and End-to-end Approach. Highlight: In the present study, we compare front-end and end-to-end DNN-based source separation approaches for two tasks: speech masked by competing speech and singing music. | Sina Tahmasebi; Waldo Nogueira | Frontiers in Neuroscience | 2026-01-13 |
| 5 | Effectiveness of Music Therapy for Delirium in Acute Hospital Settings: A Scoping Review. Highlight: Although there are reviews on non-pharmacological approaches to delirium, few have focused specifically on music therapy within acute hospital environments. This scoping review examined the evidence relating to music-based interventions for older adults who are experiencing delirium or who are at risk of delirium in acute care settings. | Stacey Leonard; Elizabeth Henderson; Gary Mitchell | Nursing Reports | 2026-01-12 |
| 6 | End-to-End Full-Page Optical Music Recognition for Pianoform Sheet Music. Highlight: In this paper, we present the first truly end-to-end approach for page-level OMR in complex layouts. | Antonio Ríos-Vila; Jorge Calvo-Zaragoza; David Rizo; Thierry Paquet | International Journal of Computer Vision | 2026-01-09 |
| 7 | Music Consumption: A Systematic Review Across The Lifespan. Highlight: The present study aimed to systematically review research concerning changes in music consumption across the lifespan to better understand how adults of all ages consume music. | Shannon J. Skeffington; Adam J. Lonsdale; Clare J. Rathbone; Mark Burgess | Empirical Studies of the Arts | 2026-01-08 |
| 8 | Supervised Contrastive Models for Music Information Retrieval in Classical Persian Music | Ali Ahmadi Katamjani; Seyed Abolghasem Mirroshandel; Mahdi Aminian | Transactions of the International Society for Music … | 2026-01-07 |
| 9 | Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control. Highlight: We train Muse via single-stage supervised finetuning of a Qwen-based language model extended with discrete audio tokens using MuCodec, without task-specific losses, auxiliary objectives, or additional architectural components. | Changhao Jiang et al. | arxiv-cs.SD | 2026-01-07 |
| 10 | Predicting Hit Songs: An Exploratory and Comparative Study. Highlight: The study explores the connection between audio features from an objective statistical standpoint and identifies that tree-based ensembles perform better when dealing with tabular datasets. | Yanrui Jerry Wu | Scholarly Review Journal | 2026-01-05 |
| 11 | Understanding Human Perception of Music Plagiarism Through A Computational Approach. Highlight: Therefore, we aim to conduct a study to examine the key criteria of human perception of music plagiarism, focusing on the three commonly used musical features in similarity analysis: melody, rhythm, and chord progression. After identifying the key features and levels of variation humans use in perceiving musical similarity, we propose an LLM-as-a-judge framework that applies a systematic, step-by-step approach, drawing on modules that extract such high-level attributes. | Daeun Hwang; Hyeonbin Hwang | arxiv-cs.SD | 2026-01-05 |
| 12 | Abstracts of The Resonance Annual Conference 2026: The Next Decade of Music. Abstract: This abstract collection records the conference abstracts and extended abstracts presented at the Resonance Annual Conference 2026, convened online under the theme “The Next … | Anqi Chen et al. | Resonance: Journal of Global Music Studies | 2026-01-05 |
| 13 | SongSage: A Large Musical Language Model with Lyric Generative Pre-training. Highlight: Comprehensive evaluations indicate that current general-purpose LLMs still have potential for improvement in playlist understanding. Inspired by this, we introduce SongSage, a large musical language model equipped with diverse lyric-centric intelligence through lyric generative pretraining. | Jiani Guo et al. | arxiv-cs.CL | 2026-01-03 |
| 14 | Distributed Cross-Domain Music Style Transfer in The SAGIN Environment. Highlight: However, these improvements face significant challenges when implemented in space-air-ground integrated network (SAGIN) environments, due to issues such as high latency, limited bandwidth, and privacy concerns. This research explores a distributed, cross-domain music style transfer model based on SAGIN environments, proposing a federated learning (FL) approach to mitigate these challenges. | Jinzi Huang; Chihhsiong Shih | Transactions on Emerging Telecommunications Technologies | 2025-12-30 |
| 15 | Emotion Analysis in Song Lyrics Using Natural Language Processing (Analisis Emosi Dalam Lirik Lagu Menggunakan Natural Language Processing). Highlight: Music is a universal medium for expressing emotions, with song lyrics serving as a narrative component rich in affective content. This study aims to analyze the emotional landscape within popular English song lyrics collected from the Spotify platform and to examine the effectiveness of Natural Language Processing (NLP) approaches in classifying these emotions. | Michael Sabda Husada; Sri Yulianto J.P | Jurnal Indonesia: Manajemen Informatika dan Komunikasi | 2025-12-30 |
| 16 | Effects of Integrative Music Therapy in Children with Autism: A Multiple Case Study. Highlight: The primary objective of this study is to examine the potential of integrating five approaches—the Orff Schulwerk, Dalcroze Eurhythmics, traditional Chinese Five Elements music, the Mozart Effect, and Neurologic Music Therapy—within a comprehensive framework of seven elements, for the improvement of attention processes, emotional self-regulation and social interaction in children diagnosed with Autism Spectrum Disorder (ASD) (American Psychiatric Association, 2022). | Tianjiao Ma; Inmaculada Chiva Sanchis; Genoveva Ramos Santana; Zhenhan Liu | RELIEVE – Revista Electrónica de Investigación y Evaluación … | 2025-12-30 |
| 17 | Multi Agents Semantic Emotion Aligned Music to Image Generation with Music Derived Captions. Highlight: When people listen to music, they often experience rich visual imagery. We aim to externalize this inner imagery by generating images conditioned on music. | Junchang Shi; Gang Li | arxiv-cs.MM | 2025-12-29 |
| 18 | Song Lyric Meaning Generator Using Transformer (Case Study: Drake’s Song Lyrics). Highlight: This study aims to develop a web-based lyric meaning generator capable of automatically interpreting Drake’s lyrics using the Transformer architecture. | Tubagus Alwasi’i; Sam Farisa Chaerul Haviana | Journal of Software Engineering and Multimedia (JASMED) | 2025-12-29 |
| 19 | Direction Finding with Sparse Arrays Based on Variable Window Size Spatial Smoothing. Highlight: In this work, we introduce a variable window size (VWS) spatial smoothing framework that enhances coarray-based direction of arrival (DOA) estimation for sparse linear arrays. | Wesley S. Leite; Rodrigo C. de Lamare; Yuriy Zakharov; Wei Liu; Martin Haardt | arxiv-cs.LG | 2025-12-26 |
| 20 | Predictive Controlled Music. Highlight: This paper presents a new approach to algorithmic composition, called predictive controlled music (PCM), which combines model predictive control (MPC) with music generation. | Midhun T. Augustine | arxiv-cs.SD | 2025-12-26 |
| 21 | Music, Spirituality, and Mental Health. Highlight: However, the scientific evidence on its impacts on mental health is dispersed. This systematic review, conducted under the PRISMA guidelines, seeks to consolidate the evidence on the relationship between religious/gospel music and outcomes in mental health, well-being, and resilience. | Jessika Camila Souza Gonçalez | Revista Gênero e Interdisciplinaridade | 2025-12-24 |
| 22 | Processing of Vocal Music Using Artificial Intelligence: Unveiling Creative Potential and Shaping Listener Perception. Highlight: This study quantitatively assesses the impact of AI on vocal-music processing. | Ning Wang; Zhenyao Cai | Perceptual and Motor Skills | 2025-12-23 |
| 23 | AutoSchA: Automatic Hierarchical Music Representations Via Multi-Relational Node Isolation. Highlight: This paper thus introduces a novel approach, AutoSchA, which extends recent developments in graph neural networks (GNNs) for hierarchical music analysis. | Stephen Ni-Hahn et al. | arxiv-cs.SD | 2025-12-20 |
| 24 | Do Foundational Audio Encoders Understand Music Structure? Highlight: Although many open-source FAE models are available, only a small subset has been examined for MSA, and the impact of factors such as learning methods, training data, and model context length on MSA performance remains unclear. In this study, we conduct comprehensive experiments on 11 types of FAEs to investigate how these factors affect MSA performance. | Keisuke Toyama; Zhi Zhong; Akira Takahashi; Shusuke Takahashi; Yuki Mitsufuji | arxiv-cs.SD | 2025-12-18 |
| 25 | Teaching Critical Thinking Using Public Music Theory. Highlight: This article presents a practical methodology for incorporating critical thinking and writing skills in music theory courses using online videos of public music theory. | Jeremy Robins | Engaging Students: Essays in Music Pedagogy | 2025-12-18 |
| 26 | WeMusic-Agent: Efficient Conversational Music Recommendation Via Knowledge Internalization and Agentic Boundary Learning. Highlight: This paper proposes WeMusic-Agent, a training framework for efficient LLM-based conversational music recommendation. | Wendong Bi et al. | arxiv-cs.AI | 2025-12-17 |
| 27 | MuseCPBench: An Empirical Study of Music Editing Methods Through Music Context Preservation. Highlight: While some studies do consider MCP, they adopt inconsistent evaluation protocols and metrics, leading to unreliable and unfair comparisons. To address this gap, we introduce the first MCP evaluation benchmark, MuseCPBench, which covers four categories of musical facets and enables comprehensive comparisons across five representative music editing baselines. | Yash Vishe et al. | arxiv-cs.SD | 2025-12-16 |
| 28 | Web-Based Collaborative Music Lessons: Approaches, Challenges and Perspectives. Highlight: Through a survey of instructor needs and platform evaluations, the study identifies key development priorities and highlights challenges such as latency, synchronization, and user experience. | Chrisoula Alexandraki; Konstantinos Tsioutas | Journal of the Audio Engineering Society | 2025-12-16 |
| 29 | Survey Paper on Music Recommendation System Using Facial Recognition and Deep Learning. Highlight: This study introduces an emotion-aware music recommendation system that adapts song suggestions in real time based on a user’s facial expressions. | Namarta Gawande | International Journal for Research in Applied Science and … | 2025-12-15 |
| 30 | The Renaissance of Expert Systems: Optical Recognition of Printed Chinese Jianpu Musical Scores with Lyrics. Highlight: We present a modular expert-system pipeline that converts printed Jianpu scores with lyrics into machine-readable MusicXML and MIDI, without requiring massive annotated training data. | Fan Bu et al. | arxiv-cs.CV | 2025-12-15 |
| 31 | Let The Model Learn to Feel: Mode-Guided Tonality Injection for Symbolic Music Emotion Recognition. Highlight: While these models effectively capture distributional musical semantics, they often overlook tonal structures, particularly musical modes, which play a critical role in emotional perception according to music psychology. In this paper, we investigate the representational capacity of MIDIBERT and identify its limitations in capturing mode-emotion associations. | Haiying Xia; Zhongyi Huang; Yumei Tan; Shuxiang Song | arxiv-cs.SD | 2025-12-14 |
| 32 | Procedural Music Generation Systems in Games. Highlight: Through a comparative analysis, this study identifies key research challenges in algorithm implementation, music quality and game integration. | Shangxuan Luo; Joshua Reiss | arxiv-cs.SD | 2025-12-14 |
| 33 | AutoMV: An Automatic Multi-Agent System for Music Video Generation. Highlight: We propose AutoMV, a multi-agent system that generates full music videos (MVs) directly from a song. | Xiaoxuan Tang et al. | arxiv-cs.MM | 2025-12-13 |
| 34 | PhraseVAE and PhraseLDM: Latent Diffusion for Full-Song Multitrack Symbolic Music Generation. Highlight: This technical report presents a new paradigm for full-song symbolic music generation. | Longshen Ou; Ye Wang | arxiv-cs.SD | 2025-12-12 |
| 35 | Reframing Music-Driven 2D Dance Pose Generation As Multi-Channel Image Generation. Highlight: We address this by reframing music-to-dance generation as a music-token-conditioned multi-channel image synthesis problem: 2D pose sequences are encoded as one-hot images, compressed by a pretrained image VAE, and modeled with a DiT-style backbone, allowing us to inherit architectural and training advances from modern text-to-image models and better capture high-variance 2D pose distributions. On top of this formulation, we introduce (i) a time-shared temporal indexing scheme that explicitly synchronizes music tokens and pose latents over time and (ii) a reference-pose conditioning strategy that preserves subject-specific body proportions and on-screen scale while enabling long-horizon segment-and-stitch generation. | Yan Zhang et al. | arxiv-cs.CV | 2025-12-12 |
| 36 | MUSIGAIN: Adaptive Graph Attention Network for Multi-Relationship Mining in Music Knowledge Graphs. Highlight: The framework introduces three key innovations: (1) a layer-wise dynamic skipping mechanism that adaptively controls propagation depth based on third-order embedding stability, reducing computation by 30–40% while preventing over-smoothing; (2) the DiGRAF adaptive activation function that enables node-specific nonlinear transformations to capture semantic heterogeneity across different entity types; and (3) ranking-based optimization supervised by graph robustness metrics, focusing on relative importance ordering rather than absolute value prediction. | Mian Chen; Tinghao Wang; Chunhao Li; Yuheng Li | Electronics | 2025-12-12 |
| 37 | The Rise of Technology in Music Education: A Bibliometric Study of A Rapidly Growing Field. Highlight: The aim of this study is to examine publications on the use of technology in music education between 2016 and 2025 using bibliometric methods. | Festa Nevzati Thaçi | Architecture Image Studies | 2025-12-11 |
| 38 | MR-FlowDPO: Multi-Reward Direct Preference Optimization for Flow-Matching Text-to-Music Generation. Highlight: We introduce MR-FlowDPO, a novel approach that enhances flow-matching-based music generation models – a major class of modern music generative models – using Direct Preference Optimization (DPO) with multiple musical rewards. | Alon Ziv et al. | arxiv-cs.SD | 2025-12-10 |
| 39 | Text-to-Music Generation Using AI: Theoretical Foundations and Practical Applications. Highlight: This research explores the theoretical foundations linking language and music through semantic, emotional, and structural analysis, and demonstrates practical integration of AI music generation into software via APIs. | Quang Minh Trinh; Thi Lan Ngo; Hue Trinh; Xuan Tung Bui | Journal of Science and Technology | 2025-12-08 |
| 40 | PolyLingua: Margin-based Inter-class Transformer for Robust Cross-domain Language Detection. Highlight: We introduce PolyLingua, a lightweight Transformer-based model for in-domain language detection and fine-grained language classification. | Ali Lotfi Rezaabad et al. | arxiv-cs.LG | 2025-12-08 |
| 41 | Incorporating Structure and Chord Constraints in Symbolic Transformer-based Melodic Harmonization. Highlight: This paper studies the inclusion of predefined chord constraints in melodic harmonization, i.e., where a desired chord at a specific location is provided along with the melody as inputs and the autoregressive transformer model needs to incorporate the chord in the harmonization that it generates. The peculiarities of involving such constraints are discussed and an algorithm is proposed for tackling this task. | Maximos Kaliakatsos-Papakostas et al. | arxiv-cs.SD | 2025-12-08 |
| 42 | Morphologically-Informed Tokenizers for Languages with Non-Concatenative Morphology: A Case Study of Yoloxóchtil Mixtec ASR. Highlight: We present two novel tokenization schemes that separate words in a nonlinear manner, preserving information about tonal morphology as much as possible. | Chris Crawford | arxiv-cs.CL | 2025-12-05 |
| 43 | Thomas Aquinas, Artificial Intelligence, and AI-Generated Music. Highlight: Rather than arguing AI music is either a technological innovation or artistic threat, this article suggests various frameworks of analogy, participation, and pneumatology to create a better theological discernment on how divine creativity works through secondary causes within creation. | Chee Man Michael Tang | New Blackfriars | 2025-12-05 |
| 44 | Lyrics Matter: Exploiting The Power of Learnt Representations for Music Popularity Prediction. Highlight: We present an automated pipeline that uses an LLM to extract high-dimensional lyric embeddings, capturing semantic, syntactic, and sequential information. | Yash Choudhary; Preeti Rao; Pushpak Bhattacharyya | arxiv-cs.SD | 2025-12-05 |
| 45 | ArtistMus: A Globally Diverse, Artist-Centric Benchmark for Retrieval-Augmented Music Question Answering. Highlight: We introduce MusWikiDB, a vector database of 3.2M passages from 144K music-related Wikipedia pages, and ArtistMus, a benchmark of 1,000 questions on 500 diverse artists with metadata such as genre, debut year, and topic. | Daeyong Kwon; SeungHeon Doh; Juhan Nam | arxiv-cs.CL | 2025-12-05 |
| 46 | Who Will Top The Charts? Multimodal Music Popularity Prediction Via Adaptive Fusion of Modality Experts and Temporal Engagement Modeling. Highlight: Existing methods suffer from four limitations: (i) temporal dynamics in audio and lyrics are averaged away; (ii) lyrics are represented as a bag of words, disregarding compositional structure and affective semantics; (iii) artist- and song-level historical performance is ignored; and (iv) multimodal fusion approaches rely on simple feature concatenation, resulting in poorly aligned shared representations. To address these limitations, we introduce GAMENet, an end-to-end multimodal deep learning architecture for music popularity prediction. | Yash Choudhary; Preeti Rao; Pushpak Bhattacharyya | arxiv-cs.SD | 2025-12-05 |
| 47 | YingMusic-Singer: Zero-shot Singing Voice Synthesis and Editing with Annotation-free Melody Guidance. Highlight: Singing Voice Synthesis (SVS) remains constrained in practical deployment due to its strong dependence on accurate phoneme-level alignment and manually annotated melody contours, requirements that are resource-intensive and hinder scalability. To overcome these limitations, we propose a melody-driven SVS framework capable of synthesizing arbitrary lyrics following any reference melody, without relying on phoneme-level alignment. | Junjie Zheng et al. | arxiv-cs.SD | 2025-12-04 |
| 48 | Outward Threads – Intuitive Computers / Rational Composers. Highlight: The project ‘Outward Threads’ is an artistic investigation rooted in music composition, integrating computational frameworks from machine learning and artificial intelligence to create new music. | Juan Sebastián Vassallo | KMD Artistic Research | 2025-12-03 |
| 49 | Wastewater Analyses for Psychoactive Substances at Music Festivals: A Systematic Review. Highlight: Music festivals have emerged as venues for the consumption of recreational drugs and novel psychoactive substances. This systematic review provides the first critical evaluation and synthesis of published wastewater analyses for detecting recreational drug use at music festivals worldwide. | Ringala Cainamisir; Xiao Zeng; Samuel B. Himmerich; Hubertus Himmerich | Behavioral Sciences | 2025-12-03 |
| 50 | Interpretable Content-Based Music Genre Classification Utilizing A Modified Artificial Immune System with Binary Similarity Matching. Highlight: This work bridges computational intelligence and music analysis, offering a novel perspective on immune-inspired learning for content classification. | Noor Azilah Muda; Choo Yun Huoy; Azah Kamilah Muda | International Journal of Research and Innovation in Social … | 2025-12-03 |
| 51 | Perception of AI-Generated Music – The Role of Composer Identity, Personality Traits, Music Preferences, and Perceived Humanness. Highlight: This study investigates how composer information and listener characteristics shape the perception of AI-generated music, adopting a mixed-method approach. | David Stammer; Hannah Strauss; Peter Knees | arxiv-cs.HC | 2025-12-02 |
| 52 | Elements of Music That Work to Improve Sleep, A Narrative Review. Highlight: Overall, this review highlights the elements at work that make music a safe, scalable, and culturally adaptable adjunct to traditional sleep therapies. | Ethan Y. Pan; Wei Wang | Frontiers in Sleep | 2025-12-02 |
| 53 | YingVideo-MV: Music-Driven Multi-Stage Video Generation. Highlight: We present YingVideo-MV, the first cascaded framework for music-driven long-video generation. | Jiahui Chen et al. | arxiv-cs.CV | 2025-12-02 |
| 54 | Generative Multi-modal Feedback for Singing Voice Synthesis Evaluation. Highlight: However, current methods like reward systems often rely on single numerical scores, struggle to capture various dimensions such as phrasing or expressiveness, and require costly annotations, limiting interpretability and generalization. To address these issues, we propose a generative feedback (i.e., reward model) framework that provides multi-dimensional language and audio feedback for SVS assessment. | Xueyan Li et al. | arxiv-cs.SD | 2025-12-02 |
| 55 | Story2MIDI: Emotionally Aligned Music Generation from Text. Highlight: In this paper, we introduce Story2MIDI, a sequence-to-sequence Transformer-based model for generating emotion-aligned music from a given piece of text. | Mohammad Shokri; Alexandra C. Salem; Gabriel Levine; Johanna Devaney; Sarah Ita Levitan | arxiv-cs.SD | 2025-12-01 |
| 56 | Melody or Machine: Detecting Synthetic Music with Dual-Stream Contrastive Learning. Highlight: MoM is the most diverse dataset to date, built with a mix of open and closed-source models and a curated OOD test set designed specifically to foster the development of truly generalizable detectors. Alongside this benchmark, we introduce CLAM, a novel dual-stream detection architecture. | Arnesh Batra et al. | arxiv-cs.SD | 2025-11-29 |
| 57 | Synthesizing Nostalgia: How An AI-Generated ‘Ejaje 1981’ Polish Hit Rewired Memory, Virality, and Copyrights. Highlight: Using an interdisciplinary approach, we conducted (i) a musicological analysis of Ejaje 1981, (ii) a legal and phonographic market perspective, and (iii) a sociological analysis of ~1200 social-media comments and sharing patterns across platforms. | Andrzej Buda; Andrzej Jarynowski | E-methodology | 2025-11-29 |
| 58 | An Intelligent Approach Toward Lyrics Text Classification Using Multilevel Cross Attention-Based Adaptive BiLSTM With Relevant Feature Extraction. Highlight: Hence, it is important to replace the standard text classification approach with other advanced methods to classify the different language lyrics effectively. To solve these issues, an efficient deep learning-based lyrics text classification is developed in this work to classify the lyrics text. | Jasmine Raja Lawrence; Saswati Mukherjee; C. R. Rene Robin; David Raj Gnanamuthu | Computational Intelligence | 2025-11-28 |
| 59 | Artificial Intelligence in Music Generation: Models, Evaluation, Applications, and Future Prospects. Highlight: This paper presents a comprehensive review of the current state of AI music generation, covering the historical development of computer-assisted music production and AI-assisted music from early analog and digital tools to modern neural network architectures, and highlighting key developments such as MIDI, DAWs, plugins, and early algorithmic composition systems. | Tiffany Chiu | Theoretical and Natural Science | 2025-11-26 |
| 60 | DUO-TOK: Dual-Track Semantic Music Tokenizer for Vocal-Accompaniment Generation. Abstract: Duo-Tok is a source-aware dual-codebook tokenizer for vocal-accompaniment music that targets the growing tension between reconstruction quality and language-model (LM) … | Rui Lin et al. | arxiv-cs.SD | 2025-11-25 |
| 61 | The Use of Tele-Music Interventions in Supportive Cancer Care: A Systematic Review Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Objectives: This systematic review seeks to provide an in-depth overview of current research on tele-music interventions in supportive cancer care and identifies key areas where further research is warranted. |
LORE MERTENS et. al. | Brain Sciences | 2025-11-25 |
| 62 | The Effect of The Typicality of Song Lyrics on Song Popularity: A Natural Language Processing Analysis of The British Top Singles Chart Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study investigates the relationship between the lexical typicality of song lyrics and song popularity in the UK Official Singles Chart from 1999 to 2013. |
Khaoula Chehbouni; Florian Carichon; Adrien Simonnot-Lanciaux; Gilles Caporossi; Danilo C. Dantas; | Psychology of Music | 2025-11-24 |
| 63 | Music-induced Physiological Markers for Detecting Alzheimer’s Disease Using Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The Random Forest classifier distinguished AD patients from healthy controls with 70.5% accuracy, while the Naïve Bayes model predicted severity with 65.6% accuracy, demonstrating that ML models can detect subtle music-evoked physiological differences even in individuals with AD. |
Rodrigo Lima; Gonçalo Barradas; Sergi Bermúdez i Badia; | Frontiers in Aging Neuroscience | 2025-11-24 |
| 64 | Multidimensional Music Aesthetic Evaluation Via Semantically Consistent C-Mixup Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a robust music aesthetic evaluation framework that combines (1) multi-source multi-scale feature extraction to obtain complementary segment- and track-level representations, (2) a hierarchical audio augmentation strategy to enrich training data, and (3) a hybrid training objective that integrates regression and ranking losses for accurate scoring and reliable top-song identification. |
SHUYANG LIU et. al. | arxiv-cs.SD | 2025-11-24 |
| 65 | GeeSanBhava: Sentiment Tagged Sinhala Music Video Comment Data Set Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This research contributes a valuable annotated dataset and provides insights for future work in Sinhala Natural Language Processing and music emotion recognition. |
Yomal De Mel; Nisansa de Silva; | arxiv-cs.CL | 2025-11-22 |
| 66 | MusicAIR: A Multimodal AI Music Generation Framework Powered By An Algorithm-Driven Core Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In contrast, we propose MusicAIR, an innovative multimodal AI music generation framework powered by a novel algorithm-driven symbolic music core, effectively mitigating copyright infringement risks. |
Callie C. Liao; Duoduo Liao; Ellie L. Zhang; | arxiv-cs.SD | 2025-11-21 |
| 67 | Generative Adversarial Post-Training Mitigates Reward Hacking in Live Human-AI Music Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel adversarial training method on policy-generated trajectories to mitigate reward hacking in RL post-training for melody-to-chord accompaniment. |
YUSONG WU et. al. | arxiv-cs.LG | 2025-11-21 |
| 68 | AI System for Music Generation Based on User Preferences Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a lightweight, transparent, rule-based framework for affective melody generation coupled with a deterministic validation engine. |
MR. LANKE RAVI KUMAR et. al. | International Scientific Journal of Engineering and … | 2025-11-21 |
| 69 | Difficulty-Controlled Simplification of Piano Scores with Synthetic Data for Inclusive Music Education Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work introduces a transformer-based method for adjusting the difficulty of MusicXML piano scores. |
Pedro Ramoneda; Emilia Parada-Cabaleiro; Dasaem Jeong; Xavier Serra; | arxiv-cs.SD | 2025-11-20 |
| 70 | LargeSHS: A Large-scale Dataset of Music Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent advances in AI-based music generation have focused heavily on text-conditioned models, with less attention given to reference-based generation such as song adaptation. To support this line of research, we introduce LargeSHS, a large-scale dataset derived from SecondHandSongs, containing over 1.7 million metadata entries and approximately 900k publicly accessible audio links. |
Chih-Pin Tan; Hsuan-Kai Kao; Li Su; Yi-Hsuan Yang; | arxiv-cs.SD | 2025-11-19 |
| 71 | Music Genre Classification Using Prosodic, Stylistic, Syntactic and Sentiment-Based Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Incorporating different influences at different times, today it boasts a wide range of both autochthonous and imported genres, such as traditional folk music, rock, rap, pop, and manele, to name a few. We aim to trace the linguistic differences between the lyrics of these genres using natural language processing and a computational linguistics approach by studying the prosodic, stylistic, syntactic, and sentiment-based features of each genre. |
Erik-Robert Kovacs; Stefan Baghiu; | Big Data and Cognitive Computing | 2025-11-19 |
| 72 | HOW AI SHAPES MUSIC RECOMMENDATIONS AND CONSUMER PREFERENCES Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The paper analyses machine learning algorithms on parameters such as precision, recall, and accuracy scores. |
Ashutosh Rastogi; | EPRA International Journal of Research & Development … | 2025-11-19 |
| 73 | Aligning Generative Music AI with Human Preferences: Methods and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We identify key research challenges, including scalability to long-form compositions and reliability in preference modelling, among others. |
Dorien Herremans; Abhinaba Roy; | arxiv-cs.SD | 2025-11-18 |
| 74 | MuCPT: Music-related Natural Language Model Continued Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: On the training side, we introduce reference-model (RM)-based token-level soft scoring for quality control: a unified loss-ratio criterion is used both for data selection and for dynamic down-weighting during optimization, reducing noise gradients and amplifying task-aligned signals, thereby enabling more effective music-domain continued pretraining and alignment. |
Kai Tian; Yirong Mao; Wendong Bi; Hanjie Wang; Que Wenhui; | arxiv-cs.CL | 2025-11-18 |
| 75 | A Controllable Perceptual Feature Generative Model for Melody Harmonization Via Conditional Variational Autoencoder Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Many studies have introduced emotion models to guide the generative process. |
Dengyun Huang; Yonghua Zhu; | arxiv-cs.SD | 2025-11-18 |
| 76 | A Deep Learning Approach for The Analysis of Birdsong Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We developed a novel deep learning-based behavior analysis pipeline, Avian Vocalization Network (AVN), for zebra finch learned vocalizations – the most widely studied vocal learning model species. |
Therese MI Koch; Ethan S Marks; Todd F Roberts; | eLife | 2025-11-18 |
| 77 | Applications of The MUSIC Model of Motivation and Its Associated Inventories: A Systematic Review and Meta-Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Searches were conducted across five databases (Education Research Complete, ERIC, PsycInfo, Web of Science, and ProQuest Dissertations & Theses Global), supplemented by Google Scholar and the MUSIC Model Research Lab website. Peer-reviewed journal articles and dissertations published between 2009 and 2022 were included if they employed all five components of the MUSIC model (eMpowerment, Usefulness, Success, Interest, and Caring) as part of their study framework or assessment procedure. |
Brett D. Jones; Zeynep Ambarkutuk; Dale H. Schunk; | International Journal of Applied Positive Psychology | 2025-11-18 |
| 78 | A Descriptive Literature Review of Zahara’s Plea for Africa’s Healing, Hope, and Answers in ‘Phendula: A Cry for God’s Intervention Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores Zahara’s song Phendula as a powerful musical plea for Africa’s healing, hope, and divine intervention. |
Sakhiseni Joseph Yende; | International Journal of Research in Business and Social … | 2025-11-16 |
| 79 | Decoding Nature’s Melody: Significance and Challenges of Machine Learning in Assessing Bird Diversity Via Soundscape Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
JIANGJIAN XIE et. al. | Artificial Intelligence Review | 2025-11-14 |
| 80 | PISA: Combining Transformers and ACT-R for Repeat-Aware Sequential Listening Session Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This presents a key challenge as repeatedly listening to the same song over time is a common habit that can shape how users experience and interpret this song. In this paper, we introduce PISA (Psychology-Informed Session embedding using ACT-R), a session-level sequential recommender system designed to overcome this challenge. |
Viet Anh TRAN; Guillaume SALHA-GALVAN; Bruno SGUERRA; Romain HENNEQUIN; | ACM Transactions on Recommender Systems | 2025-11-14 |
| 81 | A.I. Artificial Intelligence (2001) As The Spiritual Swan Song of Stanley Kubrick Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This article proposes a reading of A.I. Artificial Intelligence (2001) as the spiritual swan song of Stanley Kubrick, even though it was completed posthumously by Steven Spielberg. |
Alexandre Nascimento Braga Teixeira; | Arts | 2025-11-13 |
| 82 | Video Echoed in Music: Semantic, Temporal, and Rhythmic Alignment for Video-to-Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the challenges, we propose Video Echoed in Music (VeM), a latent music diffusion that generates high-quality soundtracks with semantic, temporal, and rhythmic alignment for input videos. |
XINYI TONG et. al. | arxiv-cs.SD | 2025-11-12 |
| 83 | Chord-conditioned Melody and Bass Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We evaluate five Transformer-based strategies for chord-conditioned melody and bass generation using a set of music theory-motivated metrics capturing pitch content, pitch interval size, and chord tone usage. |
Alexandra C Salem; Mohammad Shokri; Johanna Devaney; | arxiv-cs.SD | 2025-11-11 |
| 84 | Melodia: Training-Free Music Editing Guided By Attention Probing in Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In contrast, self-attention maps are essential for preserving the temporal structure of the source music during its conversion into the target music. Building upon this understanding, we present Melodia, a training-free technique that selectively manipulates self-attention maps in particular layers during the denoising process and leverages an attention repository to store source music information, achieving accurate modification of musical characteristics while preserving the original structure without requiring textual descriptions of the source music. |
YI YANG et. al. | arxiv-cs.SD | 2025-11-11 |
| 85 | Multisensory Interactive Music in Extended Reality: A Comprehensive Investigation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper investigates the emerging field of multisensory interactive music within Extended Reality (XR), focusing on the convergence of spatialized audio, embodied interaction, reactive visuals, narrative coherence, as well as emotional resonance. |
Liying Huang; | Communications in Humanities Research | 2025-11-11 |
| 86 | MPJudge: Towards Perceptual Assessment of Music-Induced Paintings Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods primarily rely on emotion recognition models to assess the similarity between music and painting, but such models introduce considerable noise and overlook broader perceptual cues beyond emotion. To address these limitations, we propose a novel framework for music-induced painting assessment that directly models perceptual coherence between music and visual art. |
Shiqi Jiang; Tianyi Liang; Changbo Wang; Chenhui Li; | arxiv-cs.CV | 2025-11-10 |
| 87 | Who Gets Heard? Rethinking Fairness in AI for Music Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Such harms risk reinforcing biases, limiting creativity, and contributing to cultural erasure. To address this, we offer recommendations at the dataset, model, and interface level in music-AI systems. |
ATHARVA MEHTA et. al. | arxiv-cs.CY | 2025-11-08 |
| 88 | Robust Neural Audio Fingerprinting Using Music Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We systematically evaluate our methods in comparison to two state-of-the-art neural fingerprinting models: NAFP and GraFPrint. |
SHUBHR SINGH et. al. | arxiv-cs.SD | 2025-11-07 |
| 89 | BNMusic: Blending Environmental Noises Into Personalized Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, misalignment between the dominant sound and the noise—such as mismatched downbeats—often requires an excessive volume increase to achieve effective masking. Motivated by recent advances in cross-modal generation, in this work, we introduce an alternative method to acoustic masking, aiming to reduce the noticeability of environmental noises by blending them into personalized music generated based on user-provided text prompts. |
CHI ZUO et. al. | nips | 2025-11-07 |
| 90 | Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a unified framework for automatic multitrack music arrangement that enables a single pre-trained symbolic music model to handle diverse arrangement scenarios, including reinterpretation, simplification, and additive generation. |
LONGSHEN OU et. al. | nips | 2025-11-07 |
| 91 | SongBloom: Coherent Song Generation Via Interleaved Autoregressive Sketching and Diffusion Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces SongBloom, a novel framework for full-length song generation that leverages an interleaved paradigm of autoregressive sketching and diffusion-based refinement. |
CHENYU YANG et. al. | nips | 2025-11-07 |
| 92 | LeVo: High-Quality Song Generation with Multi-Preference Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing approaches still struggle with the complex composition of songs and the scarcity of high-quality data, leading to limitations in audio quality, musicality, instruction following, and vocal-instrument harmony. To address these challenges, we introduce LeVo, a language model based framework consisting of LeLM and Music Codec. |
SHUN LEI et. al. | nips | 2025-11-07 |
| 93 | Persian Musical Instruments Classification Using Polyphonic Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a culturally informed data augmentation strategy that generates realistic polyphonic mixtures from monophonic samples. Using the MERT model (Music undERstanding with large-scale self-supervised Training) with a classification head, we evaluate our approach with out-of-distribution data which was obtained by manually labeling segments of traditional songs. |
Diba Hadi Esfangereh; Mohammad Hossein Sameti; Sepehr Harfi Moridani; Leili Javidpour; Mahdieh Soleymani Baghshah; | arxiv-cs.SD | 2025-11-07 |
| 94 | Audio Super-Resolution with Latent Bridge Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Towards high-quality audio super-resolution, we present a new system with latent bridge models (LBMs), where we compress the audio waveform into a continuous latent space and design an LBM to enable a latent-to-latent generation process that naturally matches the LR-to-HR upsampling process, thereby fully exploiting the instructive prior information contained in the LR waveform. |
Chang Li; Zehua Chen; Liyuan Wang; Jun Zhu; | nips | 2025-11-07 |
| 95 | MusRec: Zero-Shot Text-to-Music Editing Via Rectified Flow and Diffusion Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Leveraging recent advances in rectified flow and diffusion transformers, we introduce MusRec, the first zero-shot text-to-music editing model capable of performing diverse editing tasks on real-world music efficiently and effectively. |
Ali Boudaghi; Hadi Zare; | arxiv-cs.SD | 2025-11-06 |
| 96 | EMO100DB: An Open Dataset of Improvised Songs with Emotion Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we introduce Emo100DB: a dataset consisting of improvised songs that were recorded and transcribed with emotion data based on Russell’s circumplex model of emotion. |
Daeun Hwang; Saebyul Park; | arxiv-cs.SD | 2025-11-06 |
| 97 | Sing & Spell with AI: Enhancing Vocabulary Acquisition Through Song-Based Learning in ESL Classrooms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Sing & Spell with AI is an innovation that combines Artificial Intelligence with music-based learning to support vocabulary instruction in ESL primary classrooms. The tool … |
Mohammad Radzi bin Manap; Nor Fazlin binti Mohd Ramli; Siti Nur Badriah binti Mohd Tahir; | International Journal of Research and Innovation in Social … | 2025-11-05 |
| 98 | Harmony of Language and Technology: AI-Supported Music Education Through Song Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the integration of artificial intelligence (AI) in music education as an innovative approach to enhance language learning through song lyrics. |
Juriani Jamaludin; Nurulhamimi Abdul Rahman; Enrieka Ervina Ugama; Raudhatul Jannah Mohd Shahril; Brenda Louisa Bede; | International Journal of Research and Innovation in Social … | 2025-11-05 |
| 99 | A Systematic Review of Music Therapy for Symptoms of Traumatic Brain Injury and Posttraumatic Stress Disorder in Adults Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Music therapy (MT) is implemented in healthcare settings for a range of symptoms and conditions. This systematic review provides an update of the available evidence by searching a combination of keywords related to MT, traumatic brain injury (TBI), and posttraumatic stress disorder (PTSD) across four databases (PubMed, PsycINFO, PTSDpubs, Scopus) from inception to February 2025. |
JAY M. UOMOTO et. al. | NeuroRehabilitation | 2025-11-04 |
| 100 | Effect of Music Therapy on Blood Pressure and Quality of Life Among Individuals with Essential Hypertension: A Systematic Review and Meta-analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To investigate the effects of music therapy on blood pressure levels, negative emotions, and quality of life in patients with essential hypertension, a systematic review and meta-analysis based on Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines were performed in the present study. |
Zewen Li; Yi Zhang; | Well-Being Sciences Review | 2025-11-03 |
| 101 | WildScore: Benchmarking MLLMs In-the-Wild Symbolic Music Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To facilitate a comprehensive evaluation, we propose a systematic taxonomy, comprising both high-level and fine-grained musicological ontologies. |
GAGAN MUNDADA et. al. | emnlp | 2025-11-02 |
| 102 | ToneCraft: Cantonese Lyrics Generation with Harmony of Tones and Pitches Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current research has not yet addressed the challenge of generating lyrics that adhere to Cantonese harmony rules. To tackle this issue, we propose ToneCraft, a novel framework for generating Cantonese lyrics that ensures tonal and melodic harmony. |
Junyu Cheng; Chang Pan; Shuangyin Li; | emnlp | 2025-11-02 |
| 103 | Factual and Musical Evaluation Metrics for Music Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To measure the true performance of Music LMs, we propose (1) a better general-purpose evaluation metric for Music LMs adapted to the music domain and (2) a factual evaluation framework to quantify the correctness of a Music LM’s responses. |
Daniel Chenyu Lin; Michael Freeman; John Thickstun; | arxiv-cs.SD | 2025-11-02 |
| 104 | DeepResonance: Enhancing Multimodal Music Understanding Via Music-centric Multi-way Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the potential of incorporating additional modalities such as images, videos and textual music features to enhance music understanding remains unexplored. To bridge this gap, we propose DeepResonance, a multimodal music understanding LLM fine-tuned via multi-way instruction tuning with multi-way aligned music, text, image, and video data. |
Zhuoyuan Mao; Mengjie Zhao; Qiyu Wu; Hiromi Wakaki; Yuki Mitsufuji; | emnlp | 2025-11-02 |
| 105 | Exploring The Correlation Between The Type of Music and The Emotions Evoked: A Study Using Subjective Questionnaires and EEG Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The subject of this work is to check how different types of music affect human emotions. |
Jelizaveta Jankowska; Bożena Kostek; Fernando Alonso-Fernandez; Prayag Tiwari; | arxiv-cs.CV | 2025-10-30 |
| 106 | Whose Intelligence? Whose Music? Critical Reflections on AI in Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This commentary critically examines the integration of artificial intelligence (AI) into music education through two guiding questions: Whose intelligence is encoded within these … |
Jincheng Ma; Qiang Wan; | Education as Change | 2025-10-30 |
| 107 | Artificial Intelligence and Healing Education: Bibliotherapy and Musicotherapy in Primary Schooling – An Innovative Theoretical Model of Bibliotherapy and Musicotherapy Questionnaire (BMQ) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The study examines the use of the Donna AI Song Generator within healing education, aiming to identify optimal strategies for both teachers and pupils. |
Mirzana Pašić Kodrić; Merima čaušević; | Online Journal of Music Sciences | 2025-10-29 |
| 108 | Utilising Generative AI to Assist in The Creation and Production of Chinese Popular Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Objectives: This study introduces the GenAI Melody-LSTM algorithm. |
Xinhao Li; Hyuntai Kim; | Online Journal of Music Sciences | 2025-10-29 |
| 109 | Artificial Intelligence in Music: Applications, Challenges, and Future Prospects Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In particular, further research may focus on developing more transparent algorithms to improve user trust, exploring hybrid systems that integrate human creativity with machine intelligence, and establishing clearer frameworks for copyright and ownership of AI-generated works. By providing this structured overview, the review seeks to promote a deeper understanding of AIs potential as a collaborative tool in reshaping the future of music. |
Yichen Zhu; | Communications in Humanities Research | 2025-10-28 |
| 110 | Streaming Generation for Music Accompaniment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a model design considering inevitable system delays in practical deployment with two design variables: future visibility $t_f$, the offset between the output playback time and the latest input time used for conditioning, and output chunk duration $k$, the number of frames emitted per call. |
YUSONG WU et. al. | arxiv-cs.SD | 2025-10-24 |
| 111 | StylePitcher: Generating Style-Following and Expressive Pitch Curves for Versatile Singing Tasks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Existing pitch curve generators face two main challenges: they often neglect singer-specific expressiveness, reducing their ability to capture individual singing styles. And they … |
JINGYUE HUANG et. al. | arxiv-cs.SD | 2025-10-24 |
| 112 | Effects of Music Intervention in Patients with Burn Injury: A Systematic Review and Meta-analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Hao Wang; Riti Qiu; Hua Huang; Bin Chen; | Medicine | 2025-10-24 |
| 113 | A Backpropagation Neural Network Model with Adaptive Feature Extraction for Music Emotion Recognition in Online Music Appreciation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the relationship between students’ emotions and music categories in online music appreciation courses. |
Yang Chen; Chang Gao; Sahin Akdag; | PeerJ Computer Science | 2025-10-23 |
| 114 | Evaluation of Music Generated By Artificial Intelligence from A Compositional Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The study compares AI-generated music and human composition in terms of aesthetic value, originality, and coherence. |
Selin Oyan Küpeli; | ARTS: Artuklu Sanat ve Beşeri Bilimler Dergisi | 2025-10-22 |
| 115 | LyriCAR: A Difficulty-Aware Curriculum Reinforcement Learning Framework For Controllable Lyric Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose LyriCAR, a novel framework for controllable lyric translation that operates in a fully unsupervised manner. |
Le Ren; Xiangjian Zeng; Qingqiang Wu; Ruoxuan Liang; | arxiv-cs.CL | 2025-10-22 |
| 116 | Emotion Recognition in Javanese Music: A Comparative Study of Classifier Models with A Human-Annotated Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study investigates the effectiveness of three well-established machine learning models, 1D Convolutional Neural Networks (1D-CNNs), support Vector Machines (SVMs), and XGBoost, in classifying emotions in Javanese music using a manually annotated dataset. |
Moh Erwin Septianto; Ariana Tulus Purnomo; Ding Bing Lin; Chang Soo Kim; | Indonesian Journal of Computing, Engineering, and Design … | 2025-10-22 |
| 117 | Tug-of-War of Emotion: Measuring and Modeling Sentiment Cycles in Chinese-Language Pop Song Lyrics, 1967-2023 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For example, the detected monotone downward trend in the sentiment of English-language pop lyrics is typically interpreted as “reflecting” the deteriorating emotional and mental state in listener populations and/or the increasing demand for more negative (or less positive) lyric sentiment. This study challenges this “mirror interpretation” with an alternative “equilibration interpretation,” which posits that the average listener sentiment preference may remain largely stable across decades, and it is the equilibrating process that either brings the sentiment of pop lyrics closer to the listener preference or make the lyric sentiment oscillate around the listener preference. |
Xiaolu Wang; | Journal of Cultural Analytics | 2025-10-21 |
| 118 | SegTune: Structured and Fine-Grained Control for Song Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose SegTune, a non-autoregressive framework for structured and controllable song generation. |
PENGFEI CAI et. al. | arxiv-cs.SD | 2025-10-21 |
| 119 | MDD: A Dataset for Text-and-Music Conditioned Duet Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Multimodal DuetDance (MDD), a diverse multimodal benchmark dataset designed for text-controlled and music-conditioned 3D duet dance motion generation. |
Prerit Gupta; Jason Alexander Fotso-Puepi; Zhengyuan Li; Jay Mehta; Aniket Bera; | iccv | 2025-10-20 |
| 120 | Music-Aligned Holistic 3D Dance Generation Via Hierarchical Motion Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these challenges, we introduce SoulDance, a high-precision music-dance paired dataset captured via professional motion capture systems, featuring meticulously annotated holistic dance movements. Building on this dataset, we propose SoulNet, a framework designed to generate music-aligned, kinematically coordinated holistic dance sequences. |
XIAOJIE LI et. al. | iccv | 2025-10-20 |
| 121 | Not All Deepfakes Are Created Equal: Triaging Audio Forgeries for Robust Deepfake Singer Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Based on the premise that the most harmful deepfakes are those of the highest quality, we introduce a two-stage pipeline to identify a singer’s vocal likeness. |
Davide Salvi; Hendrik Vincent Koops; Elio Quinton; | arxiv-cs.SD | 2025-10-20 |
| 122 | Music Grounding By Short Video Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to bridge the gap between the practical need for music moment localization and V2MR, we propose a new task termed Music Grounding by Short Video (MGSV). |
ZIJIE XIN et. al. | iccv | 2025-10-20 |
| 123 | MuseTok: Symbolic Music Tokenization for Generation and Semantic Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Discrete representation learning has shown promising results across various domains, including generation and understanding in image, speech and language. Inspired by these advances, we propose MuseTok, a tokenization method for symbolic music, and investigate its effectiveness in both music generation and understanding tasks. |
JINGYUE HUANG et. al. | arxiv-cs.SD | 2025-10-17 |
| 124 | Application of Time GAN-LSTM Algorithm in Constructing Music Aesthetic Classification Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study proposes an algorithm that fuses temporal generative adversarial network (Time GAN) and long short-term memory network (LSTM), which is applied to construct a music aesthetic classification model in order to more accurately identify and classify music works. |
Feifei Li; Changyue Yu; | Journal of Computational Methods in Sciences and Engineering | 2025-10-17 |
| 125 | Automatic Generation of Music Education Content Based on Deep Learning Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study proposes an innovative approach to improve the quality and efficiency of music education using artificial intelligence technology. |
Lijuan Yu; | Journal of Computational Methods in Sciences and Engineering | 2025-10-17 |
| 126 | The Virtual Concert-goer: Audience Perspectives on Remote Music Performances Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our work explores how audiences perceive and engage with remote music events. |
SOPHIA PPALI et. al. | Proceedings of the ACM on Human-Computer Interaction | 2025-10-16 |
| 127 | Phrase-Oriented Generative Rhythmic Patterns for Jazz Solos Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study introduces a novel generative approach for crafting phrase-oriented rhythmic patterns in jazz solos, leveraging statistical analyses of a comprehensive corpus, the Weimar Jazz Database. |
Adriano N. Raposo; Vasco N. G. J. Soares; | Applied Sciences | 2025-10-15 |
| 128 | MotionBeat: Motion-Aligned Music Representation Via Embodied Contrastive Learning and Bar-Equivariant Contact-Aware Encoding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose MotionBeat, a framework for motion-aligned music representation learning. MotionBeat is trained with two newly proposed objectives: the Embodied Contrastive Loss (ECL), an enhanced InfoNCE formulation with tempo-aware and beat-jitter negatives to achieve fine-grained rhythmic discrimination, and the Structural Rhythm Alignment Loss (SRAL), which ensures rhythm consistency by aligning music accents with corresponding motion events. |
Xuanchen Wang; Heng Wang; Weidong Cai; | arxiv-cs.SD | 2025-10-15 |
| 129 | Automatic Generation of Music Elements Based on Artificial Intelligence Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The designed method can effectively achieve music production, meet high precision design requirements, and achieve good design results. This indicates that the music element generation method based on recurrent gradient frequency proposed in the study has good performance. |
Sha Li; | Journal of Computational Methods in Sciences and Engineering | 2025-10-14 |
| 130 | Music Genre Classification with Modified Residual Learning and Dual Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes an approach to improve the music genre classification tasks with modified residual learning and hybrid convolutional neural networks. |
MOHSIN ASHRAF et. al. | PLOS One | 2025-10-14 |
| 131 | Uncovering Hidden Themes in Indie Music: Crisp-Dm Guided LDA Topic Modeling on A Kaggle-Based Lyric Generation Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For future research, it is recommended to use larger datasets and more diverse interpretations and apply more machine learning models. |
Thoyyibah T; Yan Mitha Djaksana; | JURNAL TEKNIK INFORMATIKA | 2025-10-13 |
| 132 | Enhanced Television Broadcast Monitoring with Source Separation-assisted Audio Fingerprinting: A Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present the first extensive study comprising 13 source separation algorithms and five AFP models. |
Guillem Cortès-Sebastià; Marius Miron; Emilio Molina; Alex Ciurana; Xavier Serra; | Multimedia Tools and Applications | 2025-10-13 |
| 133 | Content-based Music Recommender System Based on Music Emotion Using Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Mohammad Ali Talaghat; Elham Parvinnia; Reza Boostani; | Iran Journal of Computer Science | 2025-10-09 |
| 134 | Writing The Future: The Impact of Artificial Intelligence and Knowledge Graphs on The Music Industry Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The objective of the study is to conceptualize and justify a hybrid model that unites the deterministic precision of knowledge graph structures with the timbral and textural richness of neural-network generative approaches. |
Popova Anastasiia; | Universal Library of Arts and Humanities | 2025-10-09 |
| 135 | Segment-Factorized Full-Song Generation on Symbolic Piano Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose the Segmented Full-Song Model (SFS) for symbolic full-song generation. |
Ping-Yi Chen; Chih-Pin Tan; Yi-Hsuan Yang; | arxiv-cs.SD | 2025-10-07 |
| 136 | LARA-Gen: Enabling Continuous Emotion Control for Music Generation Models Via Latent Affective Representation Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce LARA-Gen, a framework for continuous emotion control that aligns the internal hidden states with an external music understanding model through Latent Affective Representation Alignment (LARA), enabling effective training. |
JIAHAO MEI et. al. | arxiv-cs.SD | 2025-10-07 |
| 137 | Transcribing Rhythmic Patterns of The Guitar Track in Polyphonic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To transcribe the strums and their corresponding rhythmic patterns, we propose a three-step framework. |
Aleksandr Lukoianov; Anssi Klapuri; | arxiv-cs.SD | 2025-10-07 |
| 138 | Language Models for Longitudinal Analysis of Abusive Content in Billboard Music Charts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we utilise deep learning methods to analyse songs (lyrics) from Billboard Charts of the United States in the last seven decades. We provide a longitudinal study using deep learning and language models and review the evolution of content using sentiment analysis and abuse detection, including sexually explicit content. |
Rohitash Chandra; Yathin Suresh; Divyansh Raj Sinha; Sanchit Jindal; | arxiv-cs.CL | 2025-10-05 |
| 139 | Large Model-Enhanced CNN–Transformer Architecture for Adaptive Music Quality Classification in Wireless Communication Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This letter presents an enhanced convolutional neural network (CNN)–transformer architecture integrated with large model (LM) capabilities for adaptive music quality classification in wireless communication networks (WCNs). |
Tianyu Chen; | Internet Technology Letters | 2025-10-04 |
| 140 | Detecting Notational Errors in Digital Music Scores Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Yet, data quality is a major issue when dealing with musical information extraction and retrieval. We present an automated approach to detect notational errors, aiming at precisely localizing defects in scores. |
Géré Léo; Nicolas Audebert; Florent Jacquemard; | arxiv-cs.MM | 2025-10-03 |
| 141 | Go WitheFlow: Real-time Emotion Driven Audio Effects Modulation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce the witheFlow system, designed to enhance real-time music performance by automatically modulating audio effects based on features extracted from both biosignals and the audio itself. |
Edmund Dervakos; Spyridon Kantarelis; Vassilis Lyberatos; Jason Liartis; Giorgos Stamou; | arxiv-cs.SD | 2025-10-02 |
| 142 | Bias Beyond Borders: Global Inequalities in AI-Generated Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these challenges, we introduce GlobalDISCO, a large-scale dataset consisting of 73k music tracks generated by state-of-the-art commercial generative music models, along with paired links to 93k reference tracks in LAION-DISCO-12M. |
Ahmet Solak; Florian Grötschla; Luca A. Lanzendörfer; Roger Wattenhofer; | arxiv-cs.SD | 2025-10-02 |
| 143 | TalkPlay-Tools: Conversational Music Recommendation with LLM Tool Calling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an LLM-based music recommendation system with tool calling to serve as a unified retrieval-reranking pipeline. |
Seungheon Doh; Keunwoo Choi; Juhan Nam; | arxiv-cs.IR | 2025-10-02 |
| 144 | VMM: Video-Music Mamba for Generating Background Music from Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jiajun Xu; Zixiang Lu; Ping Gao; Qiguang Miao; Kun Xie; | Comput. Vis. Image Underst. | 2025-10-01 |
| 145 | Source Separation for A Cappella Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we study the task of multi-singer separation in a cappella music, where the number of active singers varies across mixtures. |
Luca A. Lanzendörfer; Constantin Pinkl; Florian Grötschla; | arxiv-cs.SD | 2025-09-30 |
| 146 | Data Melodification FM: Where Musical Rhetoric Meets Sonification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a design space for data melodification, where standard visualization idioms and fundamental data characteristics map to rhetorical devices of music for a more affective experience of data. |
Ke Er Amy Zhang; David Grellscheid; Laura Garrison; | arxiv-cs.HC | 2025-09-30 |
| 147 | SAGE-Music: Low-Latency Symbolic Music Generation Via Attribute-Specialized Key-Value Head Sharing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our main contributions are (1) the first systematic study of BPE’s generalizability in multi-track symbolic music, and (2) the introduction of AS-KVHS for low-latency symbolic music generation. Beyond these, we also release SAGE-Music, an open-source benchmark that matches or surpasses state-of-the-art models in generation quality. |
JIAYE TAN et. al. | arxiv-cs.SD | 2025-09-30 |
| 148 | MUSE-Explainer: Counterfactual Explanations for Symbolic Music Graph Classification Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Interpretability is essential for deploying deep learning models in symbolic music analysis, yet most research emphasizes model performance over explanation. To address this, we introduce MUSE-Explainer, a new method that helps reveal how music Graph Neural Network models make decisions by providing clear, human-friendly explanations. |
Baptiste Hilaire; Emmanouil Karystinaios; Gerhard Widmer; | arxiv-cs.SD | 2025-09-30 |
| 149 | The Shape of Surprise: Structured Uncertainty and Co-Creativity in AI Music Tools Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a thematic review of contemporary AI music systems, examining how designers incorporate randomness and uncertainty into creative practice. |
Eric Browne; | arxiv-cs.SD | 2025-09-29 |
| 150 | Ethics Statements in AI Music Papers: The Effective and The Ineffective Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we conduct a review of ethics statements across ISMIR, NIME, and selected prominent works in AI music from the past five years. |
Julia Barnett; Patrick O’Reilly; Jason Brent Smith; Annie Chu; Bryan Pardo; | arxiv-cs.CY | 2025-09-29 |
| 151 | Discovering Words in Music: Unsupervised Learning of Compositional Sparse Code for Symbolic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents an unsupervised machine learning algorithm that identifies recurring patterns — referred to as “music-words” — from symbolic music data. |
TIANLE WANG et. al. | arxiv-cs.SD | 2025-09-29 |
| 152 | Beyond Genre: Diagnosing Bias in Music Embeddings Using Concept Activation Vectors Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we apply Concept Activation Vectors (CAVs) to investigate whether non-musical singer attributes – such as gender and language – influence genre representations in unintended ways. |
Roman B. Gebhardt; Arne Kuhle; Eylül Bektur; | arxiv-cs.SD | 2025-09-29 |
| 153 | Echoes of Humanity: Exploring The Perceived Humanness of AI Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present results from a listener-focused experiment aimed at understanding how humans perceive AIM. |
FLAVIO FIGUEIREDO et. al. | arxiv-cs.AI | 2025-09-29 |
| 154 | ABC-Eval: Benchmarking Large Language Models on Symbolic Music Understanding and Instruction Following Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While symbolic music has been widely used in generation tasks, LLM capabilities in understanding and reasoning about symbolic music remain largely underexplored. To address this gap, we propose ABC-Eval, the first open-source benchmark dedicated to the understanding and instruction-following capabilities in text-based ABC notation scores. |
Jiahao Zhao; Yunjia Li; Wei Li; Kazuyoshi Yoshii; | arxiv-cs.SD | 2025-09-27 |
| 155 | Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Even methods based on emotion mapping face controversy, as emotion represents only a singular aspect of art. Additionally, most learning-based methods require substantial computational resources and large datasets for training, hindering accessibility for common users. To address these challenges, we propose the first Vision Language Model (VLM)-based I2M framework that offers high interpretability and low computational cost. |
Zijian Zhao; Dian Jin; Zijing Zhou; | arxiv-cs.SD | 2025-09-26 |
| 156 | MusicWeaver: Coherent Long-Range and Editable Music Generation from A Beat-Aligned Structural Plan Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present MusicWeaver, a music generation model conditioned on a beat-aligned structural plan. |
Xuanchen Wang; Heng Wang; Weidong Cai; | arxiv-cs.SD | 2025-09-25 |
| 157 | SiNGER: A Clearer Voice Distills Vision Transformers Further Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Prior work attempted to remove artifacts but encountered an inherent trade-off between artifact suppression and preserving informative signals from teachers. To address this, we introduce Singular Nullspace-Guided Energy Reallocation (SiNGER), a novel distillation framework that suppresses artifacts while preserving informative signals. |
Geunhyeok Yu; Sunjae Jeong; Yoonyoung Choi; Jaeseung Kim; Hyoseok Hwang; | arxiv-cs.CV | 2025-09-25 |
| 158 | Muse-it: A Tool for Analyzing Music Discourse on Reddit Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Muse-it, a platform that retrieves comprehensive Reddit data centered on user-defined queries. |
Jatin Agarwala; George Paul; Nemani Harsha Vardhan; Vinoo Alluri; | arxiv-cs.IR | 2025-09-24 |
| 159 | CoMelSinger: Discrete Token-Based Zero-Shot Singing Synthesis With Structured Melody Control and Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present CoMelSinger, a zero-shot SVS framework that enables structured and disentangled melody control within a discrete codec modeling paradigm. |
Junchuan Zhao; Wei Zeng; Tianle Lyu; Ye Wang; | arxiv-cs.SD | 2025-09-24 |
| 160 | SINGER: An Onboard Generalist Vision-Language Navigation Policy for Drones Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present SINGER for language-guided autonomous drone navigation in the open world using only onboard sensing and compute. |
Maximilian Adang; JunEn Low; Ola Shorinwa; Mac Schwager; | arxiv-cs.RO | 2025-09-22 |
| 161 | Dorabella Cipher As Musical Inspiration Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We weigh the evidence for and against the hypothesis, devise a simplified music notation, and attempt to reconstruct a melody from the cipher. |
Bradley Hauer; Colin Choi; Abram Hindle; Scott Smallwood; Grzegorz Kondrak; | arxiv-cs.CL | 2025-09-22 |
| 162 | RISE: Adaptive Music Playback for Realtime Intensity Synchronization with Exercise Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a system to adapt a user’s music to their exercise by aligning high-energy music segments with intense intervals of the workout. |
Alexander Wang; Chris Donahue; Dhruv Jain; | arxiv-cs.SD | 2025-09-21 |
| 163 | Etude: Piano Cover Generation with A Three-Stage Approach — Extract, StrucTUralize, and DEcode Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Etude, a three-stage architecture consisting of Extract, strucTUralize, and DEcode stages. |
Tse-Yang Che; Yuh-Jzer Joung; | arxiv-cs.SD | 2025-09-20 |
| 164 | The Singing Voice Conversion Challenge 2025: From Singer Identity Conversion To Singing Style Conversion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present the findings of the latest iteration of the Singing Voice Conversion Challenge, a scientific event aiming to compare and understand different voice conversion systems in a controlled environment. |
LESTER PHILLIP VIOLETA et. al. | arxiv-cs.SD | 2025-09-19 |
| 165 | Jamendo-QA: A Large-Scale Music Question Answering Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Jamendo-QA, a large-scale dataset for Music Question Answering (Music-QA). |
Junyoung Koh; Soo Yong Kim; Yongwon Choi; Gyu Hyeong Choi; | arxiv-cs.MM | 2025-09-19 |
| 166 | Music4All A+A: A Multimodal Dataset for Music Information Retrieval Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most datasets for multimodal MIR neglect this aspect and provide data at the level of individual music tracks. We aim to fill this gap by providing Music4All Artist and Album (Music4All A+A), a dataset for multimodal MIR tasks based on music artists and albums. |
Jonas Geiger; Marta Moscati; Shah Nawaz; Markus Schedl; | arxiv-cs.MM | 2025-09-18 |
| 167 | AnyAccomp: Generalizable Accompaniment Generation Via Quantized Melodic Bottleneck Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This creates a critical train-test mismatch, leading to failure on clean, real-world vocal inputs. We introduce AnyAccomp, a framework that resolves this by decoupling accompaniment generation from source-dependent artifacts. |
Junan Zhang; Yunjia Zhang; Xueyao Zhang; Zhizheng Wu; | arxiv-cs.SD | 2025-09-17 |
| 168 | Osu2MIR: Beat Tracking Dataset Derived From Osu! Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we explore the use of Osu! |
Ziyun Liu; Chris Donahue; | arxiv-cs.SD | 2025-09-16 |
| 169 | Data-Driven Analysis of Text-Conditioned AI-Generated Music: A Case Study with Suno and Udio Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Using a combination of state-of-the-art text embedding models, dimensionality reduction and clustering methods, we analyze the prompts, tags and lyrics, and automatically annotate and display the processed data in interactive plots. |
Luca Casini; Laura Cros Vila; David Dalmazzo; Anna-Kaisa Kaila; Bob L. T. Sturm; | arxiv-cs.IR | 2025-09-15 |
| 170 | Socially Aware Music Recommendation: A Multi-Modal Graph Neural Networks for Collaborative Music Consumption and Community-Based Engagement Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study presents a novel Multi-Modal Graph Neural Network (MM-GNN) framework for socially aware music recommendation, designed to enhance personalization and foster community-based engagement. |
Kajwan Ziaoddini; | arxiv-cs.IR | 2025-09-12 |
| 171 | An Adaptive CMSA for Solving The Longest Filled Common Subsequence Problem with An Application in Audio Querying Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a new benchmark dataset with significantly larger instances and demonstrate that existing datasets lack the discriminative power needed to meaningfully assess algorithm performance at scale. |
Marko Djukanovic; Christian Blum; Aleksandar Kartelj; Ana Nikolikj; Guenther Raidl; | arxiv-cs.SD | 2025-09-12 |
| 172 | Real-world Music Plagiarism Detection With Music Segment Transcription System Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a system for detecting music plagiarism by combining various MIR technologies. |
Seonghyeon Go; | arxiv-cs.AI | 2025-09-10 |
| 173 | Segment Transformer: AI-Generated Music Detection Via Music Structural Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Also, it can be difficult to clearly determine whether a piece was generated by AI or composed by humans. To address these challenges, we aim to improve the accuracy of AIGM detection by analyzing the structural patterns of music segments. |
Yumin Kim; Seonghyeon Go; | arxiv-cs.SD | 2025-09-10 |
| 174 | No Encore: Unlearning As Opt-Out in Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present preliminary results on the first application of machine unlearning techniques from ongoing research to prevent inadvertent usage of creative content. |
Jinju Kim; Taehan Kim; Abdul Waheed; Jong Hwan; Rita Singh; | arxiv-cs.CL | 2025-09-07 |
| 175 | From Joy to Fear: A Benchmark of Emotion Estimation in Pop Song Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A manually labeled dataset is constructed using a mean opinion score (MOS) approach, which aggregates annotations from multiple human raters to ensure reliable ground-truth labels. Leveraging this dataset, we conduct a comprehensive evaluation of several publicly available large language models (LLMs) under zero-shot scenarios. |
Shay Dahary; Avi Edana; Alexander Apartsin; Yehudit Aperstein; | arxiv-cs.CL | 2025-09-06 |
| 176 | Training A Perceptual Model for Evaluating Auditory Similarity in Music Adversarial Attack Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Music Information Retrieval (MIR) systems are highly vulnerable to adversarial attacks that are often imperceptible to humans, primarily due to a misalignment between model feature spaces and human auditory perception. Existing defenses and perceptual metrics frequently fail to adequately capture these auditory nuances, a limitation supported by our initial listening tests showing low correlation between common metrics and human judgments. To bridge this gap, we introduce Perceptually-Aligned MERT Transformer (PAMT), a novel framework for learning robust, perceptually-aligned music representations. |
Yuxuan Liu; Rui Sang; Peihong Zhang; Zhixin Li; Shengchen Li; | arxiv-cs.SD | 2025-09-05 |
| 177 | Learning and Composing of Classical Music Using Restricted Boltzmann Machines Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we adopted J. S. Bach’s music for training of a restricted Boltzmann machine (RBM). |
Mutsumi Kobayashi; Hiroshi Watanabe; | arxiv-cs.SD | 2025-09-05 |
| 178 | PianoBind: A Multimodal Joint Embedding Model for Pop-piano Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these limitations, we propose PianoBind, a piano-specific multimodal joint embedding model. |
Hayeon Bang; Eunjin Choi; Seungheon Doh; Juhan Nam; | arxiv-cs.SD | 2025-09-04 |
| 179 | Towards An AI Musician: Synthesizing Sheet Music Problems for Musical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This approach allows us to introduce a data synthesis framework that generates verifiable sheet music questions in both textual and visual modalities, leading to the Synthetic Sheet Music Reasoning Benchmark (SSMR-Bench) and a complementary training set. |
ZHILIN WANG et. al. | arxiv-cs.CL | 2025-09-04 |
| 180 | Non-Asymptotic Performance Analysis of DOA Estimation Based on Real-Valued Root-MUSIC Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a systematic theoretical performance analysis of the Real-Valued root-MUSIC (RV-root-MUSIC) algorithm under non-asymptotic conditions. |
JUNYANG LIU et. al. | arxiv-cs.PF | 2025-09-02 |
| 181 | CoComposer: LLM Multi-agent Collaborative Music Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce CoComposer, a multi-agent system that consists of five collaborating agents, each with a task based on the traditional music composition workflow. |
Peiwen Xing; Aske Plaat; Niki van Stein; | arxiv-cs.SD | 2025-08-29 |
| 182 | Amadeus: Autoregressive Model with Bidirectional Attribute Modelling for Symbolic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Based on this insight, we introduce Amadeus, a novel symbolic music generation framework. Amadeus adopts a two-level architecture: an autoregressive model for note sequences and a bidirectional discrete diffusion model for attributes. |
Hongju Su; Ke Li; Lan Yang; Honggang Zhang; Yi-Zhe Song; | arxiv-cs.SD | 2025-08-28 |
| 183 | CompLex: Music Theory Lexicon Constructed By Autonomous Agents for Automatic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a novel automatic music lexicon construction model that generates a lexicon, named CompLex, comprising 37,432 items derived from just 9 manually input category keywords and 5 sentence prompt templates. |
Zhejing Hu; Yan Liu; Gong Chen; Bruce X. B. Yu; | arxiv-cs.SD | 2025-08-27 |
| 184 | MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To compile MQAD, our methodology leverages specialized Music Information Retrieval (MIR) models to extract higher-level musical features and Large Language Models (LLMs) to generate natural language QA pairs. |
ZHIHAO OUYANG et. al. | arxiv-cs.SD | 2025-08-26 |
| 185 | A Survey on Evaluation Metrics for Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we shed light on this research gap, introducing a detailed taxonomy for evaluation metrics for both audio and symbolic music representations. |
Faria Binte Kader; Santu Karmaker; | arxiv-cs.SD | 2025-08-24 |
| 186 | From Sound to Sight: Towards AI-authored Music Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Conventional music visualisation systems rely on handcrafted ad hoc transformations of shapes and colours that offer only limited expressiveness. We propose two novel pipelines for automatically generating music videos from any user-specified, vocal or instrumental song using off-the-shelf deep learning models. |
LEO VITASOVIC et. al. | arxiv-cs.SD | 2025-08-20 |
| 187 | Exploring The Feasibility of LLMs for Automated Music Emotion Annotation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we annotated GiantMIDI-Piano, a classical MIDI piano music dataset, in a four-quadrant valence-arousal framework using GPT-4o, and compared against annotations provided by three human experts. |
Meng Yang; Jon McCormack; Maria Teresa Llano; Wanchao Su; | arxiv-cs.SD | 2025-08-18 |
| 188 | Motive-level Analysis of Form-functions Association in Korean Folk Song Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a method for automatic motive segmentation in Korean folk songs by fine-tuning a speech transcription model on audio lyrics with motif boundary annotations. |
Danbinaerin Han; Dasaem Jeong; Juhan Nam; | arxiv-cs.SD | 2025-08-14 |
| 189 | BeatFM: Improving Beat Tracking with Pre-trained Music Foundation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Beat tracking is a widely researched topic in music information retrieval. However, current beat tracking methods face challenges due to the scarcity of labeled data, which limits their ability to generalize across diverse musical styles and accurately capture complex rhythmic structures. To overcome these challenges, we propose a novel beat tracking paradigm, BeatFM, which introduces a pre-trained music foundation model and leverages its rich semantic knowledge to improve beat tracking performance. |
GANGHUI RU et. al. | arxiv-cs.SD | 2025-08-13 |
| 190 | Opening Musical Creativity? Embedded Ideologies in Generative-AI Music Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our aim is to investigate ideologies that are driving the early-stage development and adoption of generative-AI in music making, with a particular focus on democratization. |
Liam Pram; Fabio Morreale; | arxiv-cs.SD | 2025-08-12 |
| 191 | DAFMSVC: One-Shot Singing Voice Conversion with Dual Attention Mechanism and Flow Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these challenges, we propose DAFMSVC, where the self-supervised learning (SSL) features from the source audio are replaced with the most similar SSL features from the target audio to prevent timbre leakage. |
WEI CHEN et. al. | arxiv-cs.SD | 2025-08-07 |
| 192 | Live Music Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a new class of generative models for music called live music models that produce a continuous stream of music in real-time with synchronized user control. |
LYRIA TEAM et. al. | arxiv-cs.SD | 2025-08-06 |
| 193 | Towards Hallucination-Free Music: A Reinforcement Learning Preference Optimization Framework for Reliable Song Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current supervised fine-tuning (SFT) approaches, limited by passive label-fitting, exhibit constrained self-improvement and poor hallucination mitigation. To address this core challenge, we propose a novel reinforcement learning (RL) framework leveraging preference optimization for hallucination control. |
HUAICHENG ZHANG et. al. | arxiv-cs.SD | 2025-08-06 |
| 194 | Wearable Music2Emotion: Assessing Emotions Induced By AI-Generated Music Through Portable EEG-fNIRS Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these limitations, we propose MEEtBrain, a portable and multimodal framework for emotion analysis (valence/arousal), integrating AI-generated music stimuli with synchronized EEG-fNIRS acquisition via a wireless headband. |
SHA ZHAO et. al. | arxiv-cs.SD | 2025-08-05 |
| 195 | Via Score to Performance: Efficient Human-Controllable Long Song Generation with Bar-Level Symbolic Notation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Song generation is regarded as the most challenging problem in music AIGC; nonetheless, existing approaches have yet to fully overcome four persistent limitations: controllability, generalizability, perceptual quality, and duration. We argue that these shortcomings stem primarily from the prevailing paradigm of attempting to learn music theory directly from raw audio, a task that remains prohibitively difficult for current models. |
Tongxi Wang; Yang Yu; Qing Wang; Junlang Qian; | arxiv-cs.SD | 2025-08-02 |
| 196 | Automatic Melody Reduction Via Shortest Path Finding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel and conceptually simple computational method for melody reduction using a graph-based representation inspired by principles from computational music theories, where the reduction process is formulated as finding the shortest path. |
Ziyu Wang; Yuxuan Wu; Roger B. Dannenberg; Gus Xia; | arxiv-cs.SD | 2025-08-02 |
| 197 | Advancing The Foundation Model for Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we challenge this paradigm by introducing a unified foundation model named MuFun for holistic music understanding. |
YI JIANG et. al. | arxiv-cs.SD | 2025-08-01 |
| 198 | A First Look at Generative Artificial Intelligence-Based Music Therapy for Mental Disorders IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Mental disorders have increased rapidly and caused considerable harm to individuals as well as society in the recent decade. Hence, mental disorders have become a serious public … |
LIN SHEN et. al. | IEEE Transactions on Consumer Electronics | 2025-08-01 |
| 199 | DeformTune: A Deformable XAI Music Prototype for Non-Musicians Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces DeformTune, a prototype system that combines a tactile deformable interface with the MeasureVAE model to explore more intuitive, embodied, and explainable AI interaction. |
Ziqing Xu; Nick Bryan-Kinns; | arxiv-cs.HC | 2025-07-31 |
| 200 | Balancing Information Preservation and Disentanglement in Self-Supervised Music Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a multi-view SSL framework for disentangling music audio representations that combines contrastive and reconstructive objectives. |
Julia Wilkins; Sivan Ding; Magdalena Fuentes; Juan Pablo Bello; | arxiv-cs.SD | 2025-07-30 |
| 201 | Controllable Video-to-Music Generation with Multiple Time-Varying Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this challenge, we propose a novel multi-condition guided V2M generation framework that incorporates multiple time-varying conditions for enhanced control over music generation. |
JUNXIAN WU et. al. | arxiv-cs.MM | 2025-07-28 |
| 202 | Music Arena: Live Evaluation for Text-to-Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Music Arena, an open platform for scalable human preference evaluation of text-to-music (TTM) models. |
YONGHYUN KIM et. al. | arxiv-cs.SD | 2025-07-28 |
| 203 | JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To the best of our knowledge, our flow-matching-based JAM is the first effort toward endowing word-level timing and duration control in song generation, allowing fine-grained vocal control. To enhance the quality of generated songs to better align with human preferences, we implement aesthetic alignment through Direct Preference Optimization, which iteratively refines the model using a synthetic dataset, eliminating the need for manual data annotations. |
RENHANG LIU et. al. | arxiv-cs.SD | 2025-07-28 |
| 204 | Recommender Systems, Representativeness, and Online Music: A Psychosocial Analysis of Italian Listeners Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recommender systems shape music listening worldwide due to their widespread adoption in online platforms. Growing concerns about representational harms that these systems may cause are nowadays part of the scientific and public debate, wherein music listener perspectives are oftentimes reported and discussed from a cognitive-behaviorism perspective, but rarely contextualised under a psychosocial and cultural lens. |
Lorenzo Porcaro; Chiara Monaldi; | arxiv-cs.HC | 2025-07-24 |
| 205 | Bob’s Confetti: Phonetic Memorization Attacks in Music and Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce Adversarial PhoneTic Prompting (APT), an attack that replaces iconic phrases with homophonic alternatives (e.g., mom’s spaghetti becomes Bob’s confetti), preserving the acoustic form while largely changing semantic content. |
JAECHUL ROH et. al. | arxiv-cs.SD | 2025-07-23 |
| 206 | Learning Sparsity for Effective and Efficient Music Performance Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing Music AVQA methods often rely on dense and unoptimized representations, leading to inefficiencies in the isolation of key information, the reduction of redundancy, and the prioritization of critical samples. To address these challenges, we introduce Sparsify, a sparse learning framework specifically designed for Music AVQA. |
XINGJIAN DIAO et. al. | acl | 2025-07-21 |
| 207 | Just Ask for Music (JAM): Multimodal and Personalized Natural Language Music Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present JAM (Just Ask for Music), a lightweight and intuitive framework for natural language music recommendation. JAM models user-query-item interactions as vector translations in a shared latent space, inspired by knowledge graph embedding methods like TransE. |
ALESSANDRO B. MELCHIORRE et. al. | arxiv-cs.IR | 2025-07-21 |
| 208 | Toward Music-based Stress Management: Contemporary Biosensing Systems for Affective Regulation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These systems, including interactive music applications, brain-computer interfaces, and biofeedback devices, aim to provide engaging, personalized experiences that improve therapeutic outcomes. In this scoping and mapping review, we summarize and synthesize systematic reviews and empirical research on biosensing systems with potential applications in music-based affective regulation and stress management, identify gaps in the literature, and highlight promising areas for future research. |
Natasha Yamane; Varun Mishra; Matthew S. Goodwin; | arxiv-cs.HC | 2025-07-21 |
| 209 | SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce SongComposer, a pioneering step towards a unified song composition model that can readily create symbolic lyrics and melodies following instructions. |
SHUANGRUI DING et. al. | acl | 2025-07-21 |
| 210 | Affect-aware Cross-Domain Recommendation for Art Therapy Via Music Preference Elicitation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Art Therapy (AT) is an established practice that facilitates emotional processing and recovery through creative expression. Recently, Visual Art Recommender Systems (VA RecSys) … |
B. A. Yilma; Luis A. Leiva; | ACM Conference on Recommender Systems | 2025-07-18 |
| 211 | Temporal Adaptation of Pre-trained Foundation Models for Music Structure Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a temporal adaptation approach for fine-tuning music foundation models tailored to MSA. |
Yixiao Zhang; Haonan Chen; Ju-Chiang Wang; Jitong Chen; | arxiv-cs.SD | 2025-07-17 |
| 212 | Large Language Models’ Internal Perception of Symbolic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large language models (LLMs) excel at modeling relationships between strings in natural language and have shown promise in extending to other symbolic domains like coding or mathematics. |
Andrew Shin; Kunitake Kaneko; | arxiv-cs.CL | 2025-07-17 |
| 213 | EditGen: Harnessing Cross-Attention Control for Instruction-Based Auto-Regressive Audio Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we investigate leveraging cross-attention control for efficient audio editing within auto-regressive models. |
Vassilis Sioros; Alexandros Potamianos; Giorgos Paraskevopoulos; | arxiv-cs.SD | 2025-07-15 |
| 214 | Grammatical Structure and Grammatical Variations in Non-Metric Iranian Classical Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study we introduce a symbolic dataset composed of non-metric Iranian classical music, and algorithms for structural parsing of this music, and generation of variations. |
Maziar Kanani; Sean O Leary; James McDermott; | arxiv-cs.NE | 2025-07-14 |
| 215 | Radif Corpus: A Symbolic Dataset for Non-Metric Iranian Classical Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we introduce a digital corpus representing the complete non-metrical radif repertoire, covering all 13 existing components of this repertoire. |
Maziar Kanani; Sean O Leary; James McDermott; | arxiv-cs.SD | 2025-07-14 |
| 216 | MIDI-Zero: A MIDI-driven Self-Supervised Learning Approach for Music Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose MIDI-Zero, a novel self-supervised learning framework for CBMR that operates entirely on MIDI representations. |
Yuhang Su; Wei Hu; Hongfeng Gao; Fan Zhang; | sigir | 2025-07-13 |
| 217 | MusiScene: Leveraging MU-LLaMA for Scene Imagination and Enhanced Video Background Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, (1) we construct a large-scale video-audio caption dataset with 3,371 pairs, (2) we finetune Music Understanding LLaMA for the MSI task to create MusiScene, and (3) we conduct comprehensive evaluations and prove that our MusiScene is more capable of generating contextually relevant captions compared to MU-LLaMA. |
Fathinah Izzati; Xinyue Li; Yuxuan Wu; Gus Xia; | arxiv-cs.AI | 2025-07-08 |
| 218 | EXPOTION: Facial Expression and Motion Control for Multimodal Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Expotion (Facial Expression and Motion Control for Multimodal Music Generation), a generative model leveraging multimodal visual controls, specifically human facial expressions and upper-body motion, as well as text prompts to produce expressive and temporally accurate music. |
Fathinah Izzati; Xinyue Li; Gus Xia; | arxiv-cs.SD | 2025-07-07 |
| 219 | Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Boomerang sampling, recently proposed for the image domain, allows generating output close to an existing example, using any pretrained diffusion model. In this work, we explore its application in the audio domain as a tool for data augmentation or content manipulation. Specifically, implementing Boomerang sampling for Stable Audio Open, we augment training data for a state-of-the-art beat tracker, and attempt to replace musical instruments in recordings. |
Alexander Fichtinger; Jan Schlüter; Gerhard Widmer; | arxiv-cs.SD | 2025-07-07 |
| 220 | Evaluating Fake Music Detection Performance Under Audio Augmentations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As a response, models for detecting fake music have been proposed. In this work, we explore the robustness of such systems under audio augmentations. |
Tomasz Sroka; Tomasz Wężowicz; Dominik Sidorczuk; Mateusz Modrzejewski; | arxiv-cs.SD | 2025-07-07 |
| 221 | A Low-Complexity Geometric PEACH Approach for 3D Localization Under Elevated Planar Arrays Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Accurate and efficient user localization is critical for Integrated Sensing and Communication (ISAC) systems, particularly in urban environments with elevated antenna arrays. This … |
B. F. Costa; Taufik Abrão; | 2025 IEEE International Mediterranean Conference on … | 2025-07-07 |
| 222 | OMAR-RQ: Open Music Audio Representation Model Trained with Multi-Feature Masked Token Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present OMAR-RQ, a model trained with self-supervision via masked token classification methodologies using a large-scale dataset with over 330,000 hours of music audio. |
Pablo Alonso-Jiménez; Pedro Ramoneda; R. Oguz Araz; Andrea Poltronieri; Dmitry Bogdanov; | arxiv-cs.SD | 2025-07-04 |
| 223 | “It’s More of A Vibe I’m Going For”: Designing Text-to-Music Generation Interfaces for Video Creators Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Background music plays a crucial role in social media videos, yet finding the right music remains a challenge for video creators. These creators, often not music experts, struggle … |
Noor Hammad; C. Fraser; Erik Harpstead; Jessica Hammer; Mira Dontcheva; | Proceedings of the 2025 ACM Designing Interactive Systems … | 2025-07-04 |
| 224 | MusGO: A Community-Driven Framework For Assessing Openness in Music-Generative AI Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Through this work, we aim to clarify the concept of openness in music-generative AI and promote its transparent and responsible development. |
Roser Batlle-Roca; Laura Ibáñez-Martínez; Xavier Serra; Emilia Gómez; Martín Rocamora; | arxiv-cs.SD | 2025-07-04 |
| 225 | Content Filtering Methods for Music Recommendation: A Review Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Due to this sparsity, there are several challenges that have to be addressed with other methods. This review examines the current state of research in addressing these challenges, with an emphasis on the role of content filtering in mitigating biases inherent in collaborative filtering approaches. |
Terence Zeng; Abhishek K. Umrawal; | arxiv-cs.IR | 2025-07-02 |
| 226 | User-guided Generative Source Separation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this, we propose GuideSep, a diffusion-based MSS model capable of instrument-agnostic separation beyond the four-stem setup. |
Yutong Wen; Minje Kim; Paris Smaragdis; | arxiv-cs.SD | 2025-07-01 |
| 227 | Gregorian Melody, Modality, and Memory: Segmenting Chant with Bayesian Nonparametrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The segmentation we find achieves state-of-the-art performance in mode classification. |
Vojtěch Lanz; Jan Hajič jr; | arxiv-cs.CL | 2025-06-30 |
| 228 | The Florence Price Art Song Dataset and Piano Accompaniment Generator Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Florence B. Price was a composer in the early 20th century whose music reflects her upbringing in the American South, her African heritage, and her Western classical training. She … |
Tao-Tao He; Martin E. Malandro; Douglas Shadle; | arxiv-cs.SD | 2025-06-29 |
| 229 | TOMI: Transforming and Organizing Music Ideas for Multi-Track Compositions with Full-Song Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Aside from considering hierarchies in the temporal structure of music, this paper explores an even more important aspect: concept hierarchy, which involves generating music ideas, transforming them, and ultimately organizing them, across musical time and space, into a complete composition. To this end, we introduce TOMI (Transforming and Organizing Music Ideas) as a novel approach in deep music generation and develop a TOMI-based model via an instruction-tuned foundation LLM. |
Qi He; Gus Xia; Ziyu Wang; | arxiv-cs.SD | 2025-06-29 |
| 230 | MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, their ability to interpret music sheets remains underexplored. To bridge this gap, we introduce MusiXQA, the first comprehensive dataset for evaluating and advancing MLLMs in music sheet understanding. |
JIAN CHEN et. al. | arxiv-cs.CV | 2025-06-28 |
| 231 | A Hierarchical Deep Learning Approach for Minority Instrument Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work presents various strategies to integrate hierarchical structures into models and tests a new class of models for hierarchical music prediction. |
Dylan Sechet; Francesca Bugiotti; Matthieu Kowalski; Edouard d’Hérouville; Filip Langiewicz; | arxiv-cs.SD | 2025-06-26 |
| 232 | SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose **SongGen**, a fully open-source, single-stage auto-regressive transformer designed for controllable song generation. To foster community engagement and future research, we will release our model weights, training code, annotated data, and preprocessing pipeline. |
ZIHAN LIU et. al. | icml | 2025-06-25 |
| 233 | MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose MuseControlLite, a lightweight mechanism designed to fine-tune text-to-music generation models for precise conditioning using various time-varying musical attributes and reference audio signals. |
FANG-DUO TSAI et. al. | icml | 2025-06-25 |
| 234 | Efficient Fine-Grained Guidance for Diffusion Model Based Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Developing generative models to create or conditionally create symbolic music presents unique challenges due to the combination of limited data availability and the need for high precision in note pitch. To address these challenges, we introduce an efficient Fine-Grained Guidance (FGG) approach within diffusion models. |
Tingyu Zhu; Haoyu Liu; Ziyu Wang; Zhimin Jiang; Zeyu Zheng; | icml | 2025-06-25 |
| 235 | Large-Scale Training Data Attribution for Music Generative Models Via Unlearning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To validate the method, we perform a grid search over different hyperparameter configurations and quantitatively evaluate the consistency of the unlearning approach. |
WOOSUNG CHOI et. al. | arxiv-cs.SD | 2025-06-23 |
| 236 | Let Your Video Listen to Your Music! — Beat-Aligned, Content-Preserving Video Editing with Arbitrary Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Aligning the rhythm of visual motion in a video with a given music track is a practical need in multimedia production, yet remains an underexplored task in autonomous video … |
Xinyu Zhang; Dong Gong; Zicheng Duan; Anton van den Hengel; Lingqiao Liu; | Proceedings of the 33rd ACM International Conference on … | 2025-06-23 |
| 237 | DuetGen: Music Driven Two-Person Dance Generation Via Hierarchical Masked Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present DuetGen, a novel framework for generating interactive two-person dances from music. |
ANINDITA GHOSH et. al. | arxiv-cs.GR | 2025-06-23 |
| 238 | AI-Generated Song Detection Via Lyrics Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, in practice, such perfect lyrics are not available (only the audio is); this leaves a substantial gap in applicability in real-life use cases. In this work, we instead propose solving this gap by transcribing songs using general automatic speech recognition (ASR) models. |
Markus Frohmann; Elena V. Epure; Gabriel Meseguer-Brocal; Markus Schedl; Romain Hennequin; | arxiv-cs.SD | 2025-06-23 |
| 239 | Benchmarking Music Generation Models and Metrics Via Human Preference Studies Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we generate 6k songs using 12 state-of-the-art models and conduct a survey of 15k pairwise audio comparisons with 2.5k human participants to evaluate the correlation between human preferences and widely used metrics. |
Florian Grötschla; Ahmet Solak; Luca A. Lanzendörfer; Roger Wattenhofer; | arxiv-cs.LG | 2025-06-23 |
| 240 | AI Harmonizer: Expanding Vocal Expression with A Generative Neurosymbolic Music AI System Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present our methods, explore potential applications in performance and composition, and discuss future directions for real-time implementations. |
Lancelot Blanchard; Cameron Holt; Joseph A. Paradiso; | arxiv-cs.HC | 2025-06-22 |
| 241 | From Generality to Mastery: Composer-Style Symbolic Music Generation Via Large-Scale Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate how general music knowledge learned from a broad corpus can enhance the mastery of specific composer styles, with a focus on piano piece generation. |
Mingyang Yao; Ke Chen; | arxiv-cs.SD | 2025-06-20 |
| 242 | Hallucination Level of Artificial Intelligence Whisperer: Case Speech Recognizing Pantterinousut Rap Song Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We will compare the Faster Whisperer algorithm and YouTube’s internal speech-to-text functionality. The reference truth will be Finnish rap lyrics, which the main author’s little brother, Mc Timo, has written. |
Ismo Horppu; Frederick Ayala; Erlin Gulbenkoglu; | arxiv-cs.LG | 2025-06-19 |
| 243 | VS-Singer: Vision-Guided Stereo Singing Voice Synthesis with Consistency Schrödinger Bridge Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: To explore the potential advantages of utilizing spatial cues from images for generating stereo singing voices with room reverberation, we introduce VS-Singer, a vision-guided model … |
ZIJING ZHAO et. al. | arxiv-cs.SD | 2025-06-19 |
| 244 | Exploiting Music Source Separation for Automatic Lyrics Transcription with Whisper Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For long-form, we propose an algorithm using source separation as a vocal activity detector to derive segment boundaries, which results in a consistent reduction in WER relative to Whisper’s native long-form algorithm. |
Jaza Syed; Ivan Meresman Higgs; Ondřej Cífka; Mark Sandler; | arxiv-cs.SD | 2025-06-18 |
| 245 | Versatile Symbolic Music-for-Music Modeling Via Function Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose parameter-efficient solutions for a variety of symbolic music-for-music tasks. |
Junyan Jiang; Daniel Chin; Liwei Lin; Xuanjie Liu; Gus Xia; | arxiv-cs.SD | 2025-06-18 |
| 246 | SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a multi-task music captioning model, SonicVerse, that integrates caption generation with auxiliary music feature detection tasks such as key detection, vocals detection, and more, so as to directly capture both low-level acoustic details as well as high-level musical attributes. |
Anuradha Chopra; Abhinaba Roy; Dorien Herremans; | arxiv-cs.SD | 2025-06-18 |
| 247 | Adaptive Accompaniment with ReaLchords Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose ReaLchords, an online generative model for improvising chord accompaniment to user melody. |
YUSONG WU et. al. | arxiv-cs.SD | 2025-06-17 |
| 248 | An Open Research Dataset of The 1932 Cairo Congress of Arab Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces ORD-CC32, an open research dataset derived from the 1932 Cairo Congress of Arab Music recordings, a historically significant collection representing diverse Arab musical traditions. |
Baris Bozkurt; | arxiv-cs.SD | 2025-06-17 |
| 249 | SLEEPING-DISCO 9M: A Large-scale Pre-training Dataset for Generative Music Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Sleeping-DISCO 9M, a large-scale pre-training dataset for music and song. |
Tawsif Ahmed; Andrej Radonjic; Gollam Rabby; | arxiv-cs.SD | 2025-06-17 |
| 250 | Do Music Preferences Reflect Cultural Values? A Cross-National Analysis Using Music Embedding and World Values Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the extent to which national music preferences reflect underlying cultural values. |
Yongjae Kim; Seongchan Park; | arxiv-cs.CL | 2025-06-16 |
| 251 | Persistent Homology of Music Network with Three Different Distances Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we focus on applying persistent homology to a music graph with predefined weights. |
Eunwoo Heo; Byeongchan Choi; Myung ock Kim; Mai Lan Tran; Jae-Hun Jung; | arxiv-cs.SD | 2025-06-16 |
| 252 | DanceChat: Large Language Model-Guided Music-to-Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce DanceChat, a Large Language Model (LLM)-guided music-to-dance generation approach. |
QING WANG et. al. | arxiv-cs.CV | 2025-06-12 |
| 253 | A Correlation-permutation Approach for Speech-music Encoders Model Merging Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Motivated by Git Re-Basin, we introduce a correlation-permutation approach that aligns a music encoder’s internal layers with a speech encoder. |
FABIAN RITTER-GUTIERREZ et. al. | arxiv-cs.SD | 2025-06-12 |
| 254 | Fine-Grained Control Over Music Generation with Activation Steering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a method for fine-grained control over music generation through inference-time interventions on an autoregressive generative music transformer called MusicGen. |
DIPANSHU PANDA et. al. | arxiv-cs.SD | 2025-06-11 |
| 255 | TuneGenie: Reasoning-based LLM Agents for Preferential Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recently, Large language models (LLMs) have shown great promise across a diversity of tasks, ranging from generating images to reasoning spatially. Considering their remarkable (and growing) textual reasoning capabilities, we investigate LLMs’ potency in conducting analyses of an individual’s preferences in music (based on playlist metadata, personal write-ups, etc.) and producing effective prompts (based on these analyses) to be passed to Suno AI (a generative AI tool for music production). |
Amitesh Pandey; Jafarbek Arifdjanov; Ansh Tiwari; | arxiv-cs.SD | 2025-06-10 |
| 256 | AffectMachine-Pop: A Controllable Expert System for Real-time Pop Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present AffectMachine-Pop, an expert system capable of generating retro-pop music according to arousal and valence values, which can either be pre-determined or based on a listener’s real-time emotion states. |
Kat R. Agres; Adyasha Dash; Phoebe Chua; Stefan K. Ehrlich; | arxiv-cs.HC | 2025-06-09 |
| 257 | LeVo: High-Quality Song Generation with Multi-Preference Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing approaches still struggle with the complex composition of songs and the scarcity of high-quality data, leading to limitations in audio quality, musicality, instruction following, and vocal-instrument harmony. To address these challenges, we introduce LeVo, a language model based framework consisting of LeLM and Music Codec. |
SHUN LEI et. al. | arxiv-cs.SD | 2025-06-09 |
| 258 | An Introduction to Pitch Strength in Contemporary Popular Music Analysis and Production Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music information retrieval distinguishes between low- and high-level descriptions of music. Current generative AI models rely on text descriptions that are higher level than the … |
Emmanuel Deruty; | arxiv-cs.SD | 2025-06-09 |
| 259 | Insights on Harmonic Tones from A Generative Music Experiment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: During a studio-lab experiment involving researchers, music producers, and an AI music model generating bass-like audio, it was observed that the producers used the model’s output to convey two or more pitches with a single harmonic complex tone. This in turn revealed that the model had learned to generate structured and coherent simultaneous melodic lines using monophonic sequences of harmonic complex tones. |
Emmanuel Deruty; Maarten Grachten; | arxiv-cs.SD | 2025-06-08 |
| 260 | Methods for Pitch Analysis in Contemporary Popular Music: Vitalic’s Use of Tones That Do Not Operate on The Principle of Acoustic Resonance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The study considers tones that evoke two or more simultaneous pitches and examines various inharmonic partial layouts. |
Emmanuel Deruty; Pascal Arbez-Nicolas; David Meredith; | arxiv-cs.SD | 2025-06-08 |
| 261 | FilmComposer: LLM-Driven Music Production for Silent Film Clips Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we implement music production for silent film clips using an LLM-driven method. |
Zhifeng Xie; Qile He; Youjia Zhu; Qiwei He; Mengtian Li; | cvpr | 2025-06-07 |
| 262 | VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we systematically study music generation conditioned solely on the video. |
ZEYUE TIAN et. al. | cvpr | 2025-06-07 |
| 263 | HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces HarmonySet, a comprehensive dataset designed to advance video-music understanding. |
Zitang Zhou; Ke Mei; Yu Lu; Tianyi Wang; Fengyun Rao; | cvpr | 2025-06-07 |
| 264 | Enhancing Dance-to-Music Generation Via Negative Conditioning Latent Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we focus on the problem of generating music synchronized with rhythmic visual cues of the given dance video. |
Changchang Sun; Gaowen Liu; Charles Fleming; Yan Yan; | cvpr | 2025-06-07 |
| 265 | Let’s Chorus: Partner-aware Hybrid Song-Driven 3D Head Animation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Additionally, the absence of existing 3D singing datasets poses a considerable challenge. To address this, we collect a novel audiovisual dataset, ChorusHead which features synchronized mixed vocal audio and pseudo-3D flame motions for chorus singing. |
XIUMEI XIE et. al. | cvpr | 2025-06-07 |
| 266 | Exploring Listeners’ Perceptions of AI-generated and Human-composed Music for Functional Emotional Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work investigates how listeners perceive and evaluate AI-generated as compared to human-composed music in the context of emotional resonance and regulation. |
Kimaya Lecamwasam; Tishya Ray Chaudhuri; | arxiv-cs.HC | 2025-06-03 |
| 267 | MotionRAG-Diff: A Retrieval-Augmented Diffusion Framework for Long-Term Music-to-Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing approaches exhibit critical limitations: motion graph methods rely on fixed template libraries, restricting creative generation; diffusion models, while capable of producing novel motions, often lack temporal coherence and musical alignment. To address these challenges, we propose $\textbf{MotionRAG-Diff}$, a hybrid framework that integrates Retrieval-Augmented Generation (RAG) with diffusion-based refinement to enable high-quality, musically coherent dance generation for arbitrary long-term music inputs. |
Mingyang Huang; Peng Zhang; Bang Zhang; | arxiv-cs.SD | 2025-06-03 |
| 268 | NoRe: Augmenting Journaling Experience with Generative AI for Music Creation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we explore how AI-generated music can augment the journaling experience. |
Joonyoung Park; Hyewon Cho; Hyehyun Chu; Yeeun Lee; Hajin Lim; | arxiv-cs.HC | 2025-06-02 |
| 269 | Ontological Modeling of Music and Musicological Claims. A Case Study in Early Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Emilio M. Sanfilippo; Richard Freedman; Alessandro Mosca; | Int. J. Digit. Libr. | 2025-06-01 |
| 270 | Iola Walker: A Mobile Footfall Detection System for Music Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The objective is to find a method for materially enhancing music using hardware and software. |
William B. James; | arxiv-cs.MM | 2025-06-01 |
| 271 | An AI-driven Music Visualization System for Generating Meaningful Audio-Responsive Visuals in Real-Time Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music visualizations are visual representations or interpretations of music that often dynamically respond to audio. They have the potential to enhance the immersive and engaging … |
Jenny Huang; Christoph Johannes Weber; Sylvia Rothe; | Proceedings of the 2025 ACM International Conference on … | 2025-05-31 |
| 272 | Bridging The Gap Between Semantic and User Preference Spaces for Multi-modal Music Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel Hierarchical Two-stage Contrastive Learning (HTCL) method that models similarity from the semantic perspective to the user perspective hierarchically to learn a comprehensive music representation bridging the gap between semantic and user preference spaces. |
XIAOFENG PAN et. al. | arxiv-cs.SD | 2025-05-29 |
| 273 | ACE-Step: A Step Towards Music Generation Foundation Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce ACE-Step, a novel open-source foundation model for music generation that overcomes key limitations of existing approaches and achieves state-of-the-art performance through a holistic architectural design. |
Junmin Gong; Sean Zhao; Sen Wang; Shengyuan Xu; Joe Guo; | arxiv-cs.SD | 2025-05-28 |
| 274 | MGPHot: A Dataset of Musicological Annotations for Popular Music (1958-2022) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The Music Genome Project® is an extensive music annotation effort spanning two decades, during which a team of musicologists has been annotating a dataset of millions of songs … |
Sergio Oramas; Fabien Gouyon; Steve Hogan; Camilo Landau; Andreas Ehmann; | Trans. Int. Soc. Music. Inf. Retr. | 2025-05-28 |
| 275 | MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose MelodySim, a melody-aware music similarity model and dataset for plagiarism detection. |
Tongyu Lu; Charlotta-Marlena Geist; Jan Melechovsky; Abhinaba Roy; Dorien Herremans; | arxiv-cs.SD | 2025-05-27 |
| 276 | Music Source Restoration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce Music Source Restoration (MSR), a novel task addressing the gap between idealized source separation and real-world music production. |
Yongyi Zang; Zheqi Dai; Mark D. Plumbley; Qiuqiang Kong; | arxiv-cs.SD | 2025-05-27 |
| 277 | Semantic-Aware Interpretable Multimodal Music Auto-Tagging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we present an interpretable framework for music auto-tagging that leverages groups of musically meaningful multimodal features, derived from signal processing, deep learning, ontology engineering, and natural language processing. |
Andreas Patakis; Vassilis Lyberatos; Spyridon Kantarelis; Edmund Dervakos; Giorgos Stamou; | arxiv-cs.LG | 2025-05-22 |
| 278 | Layer-wise Investigation of Large-Scale Self-Supervised Music Representation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we analyze the advanced music representation model MusicFM and the newly emerged SSL model MuQ. |
Yizhi Zhou; Haina Zhu; Hangting Chen; | arxiv-cs.SD | 2025-05-22 |
| 279 | Moonbeam: A MIDI Foundation Model Using Both Absolute and Relative Music Attributes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Leveraging the pretrained Moonbeam, we propose 2 finetuning architectures with full anticipatory capabilities, targeting 2 categories of downstream tasks: symbolic music understanding and conditional music generation (including music infilling). |
Zixun Guo; Simon Dixon; | arxiv-cs.SD | 2025-05-21 |
| 280 | Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a unified approach, where we train a general-purpose model on many translation tasks simultaneously. |
JONGMIN JUNG et. al. | arxiv-cs.SD | 2025-05-19 |
| 281 | Distilling A Speech and Music Encoder with Task Arithmetic Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Knowledge Distillation of teacher ensembles may be a natural solution, but we posit that decoupling the distillation of the speech and music SSL models allows for more flexibility. Thus, we propose to learn distilled task vectors and then linearly interpolate them to form a unified speech+music model. |
FABIAN RITTER-GUTIERREZ et. al. | arxiv-cs.SD | 2025-05-19 |
| 282 | Text2midi-InferAlign: Improving Symbolic Music Generation with Inference-Time Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present Text2midi-InferAlign, a novel technique for improving symbolic music generation at inference time. |
Abhinaba Roy; Geeta Puri; Dorien Herremans; | arxiv-cs.SD | 2025-05-18 |
| 283 | Machine Learning Approaches to Vocal Register Classification in Contemporary Male Pop Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Particularly in pop music, where a single artist may use a variety of timbres and textures to achieve a desired quality, it can be difficult to identify what vocal register within the vocal range a singer is using. This paper presents two methods for classifying vocal registers in an audio signal of male pop music through the analysis of textural features of mel-spectrogram images. Additionally, we will discuss the practical integration of these models for vocal analysis tools, and introduce a concurrently developed software called AVRA, which stands for Automatic Vocal Register Analysis. |
Alexander Kim; Charlotte Botha; | arxiv-cs.SD | 2025-05-16 |
| 284 | Context-AI Tunes: Context-Aware AI-Generated Music for Stress Reduction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Choosing the right music can be challenging due to the overwhelming number of options and the time-consuming trial-and-error process. To address this, we propose Context-AI Tune (CAT), a system that generates personalized music based on environmental inputs and the user’s self-assessed stress level. |
Xiaoyan Wei; Zebang Zhang; Zijian Yue; Hsiang-Ting Chen; | arxiv-cs.HC | 2025-05-14 |
| 285 | A Mamba-based Network for Semi-supervised Singing Melody Extraction Using Confidence Binary Regularization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Thirdly, transformers typically require large amounts of labeled data to achieve optimal performance, but the SME task lacks sufficient annotated data. To address these issues, in this paper, we propose a mamba-based network, called SpectMamba, for semi-supervised singing melody extraction using confidence binary regularization. |
XIAOLIANG HE et. al. | arxiv-cs.SD | 2025-05-13 |
| 286 | Predicting Music Track Popularity By Convolutional Neural Networks on Spotify Features and Spectrogram of Audio Waveform Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study introduces a pioneering methodology that uses Convolutional Neural Networks (CNNs) and Spotify data analysis to forecast the popularity of music tracks. |
Navid Falah; Behnam Yousefimehr; Mehdi Ghatee; | arxiv-cs.SD | 2025-05-12 |
| 287 | Harmonycloak: Making Music Unlearnable for Generative AI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recent advances in generative AI have significantly expanded into the realms of art and music. This development has opened up a vast realm of possibilities, pushing the boundaries … |
Syed Irfan Ali Meerza; Lichao Sun; Jian Liu; | 2025 IEEE Symposium on Security and Privacy (SP) | 2025-05-12 |
| 288 | Not That Groove: Zero-Shot Symbolic Music Editing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Most work in AI music generation focused on audio, which has seen limited use in the music production industry due to its rigidity. To maximize flexibility while assuming only … |
Li Zhang; | arxiv-cs.SD | 2025-05-12 |
| 289 | CST: A Melody Generation Method Based on ChatGPT and Structure Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Ruhan He; Ruixue Liu; Tao Peng; Xinrong Hu; | Multimedia Systems | 2025-05-11 |
| 290 | Learning Music Audio Representations With Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we investigate the behavior of several music audio representation models under limited-data learning regimes. |
Christos Plachouras; Emmanouil Benetos; Johan Pauwels; | arxiv-cs.SD | 2025-05-09 |
| 291 | Automatic Music Transcription Using Convolutional Neural Networks and Constant-Q Transform Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we design a processing pipeline that can transform classical piano audio files in . |
Yohannis Telila; Tommaso Cucinotta; Davide Bacciu; | arxiv-cs.SD | 2025-05-07 |
| 292 | Detecting Spelling and Grammatical Anomalies in Russian Poetry Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a comprehensive comparison of unsupervised and supervised text anomaly detection approaches, utilizing both synthetic and human-labeled datasets. |
Ilya Koziev; | arxiv-cs.CL | 2025-05-07 |
| 293 | Flower Across Time and Media: Sentiment Analysis of Tang Song Poetry and Visual Correspondence Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While previous scholarship has examined these domains independently, the systematic correlation between evolving literary emotions and visual culture remains underexplored. This study addresses that gap by employing BERT-based sentiment analysis to quantify emotional patterns in floral imagery across Tang Song poetry, then validating these patterns against contemporaneous developments in decorative arts. Our approach builds upon recent advances in computational humanities while remaining grounded in traditional sinological methods. |
Shuai Gong; Tiange Zhou; | arxiv-cs.CL | 2025-05-07 |
| 294 | Familiarizing with Music: Discovery Patterns for Different Music Discovery Needs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, very little is known about how users discover and explore previously unknown music, and how this behavior differs for users of varying discovery needs. In this paper we bridge this gap by analyzing data from a survey answered by users of the major music streaming platform Deezer in combination with their streaming data. |
Marta Moscati; Darius Afchar; Markus Schedl; Bruno Sguerra; | arxiv-cs.IR | 2025-05-06 |
| 295 | Mamba-Diffusion Model with Learnable Wavelet for Controllable Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We represent symbolic music as image-like pianorolls, facilitating the use of diffusion models for the generation of symbolic music. |
Jincheng Zhang; György Fazekas; Charalampos Saitis; | arxiv-cs.SD | 2025-05-06 |
| 296 | Modeling Musical Genre Trajectories Through Pathlet Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We define a new framework that captures recurring patterns in genre trajectories, called pathlets, enabling the creation of comprehensible trajectory embeddings. |
Lilian Marey; Charlotte Laclau; Bruno Sguerra; Tiphaine Viard; Manuel Moussallam; | arxiv-cs.IR | 2025-05-06 |
| 297 | REFFLY: Melody-Constrained Lyrics Editing Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces REFFLY (REvision Framework For LYrics), the first revision framework for editing and generating melody-aligned lyrics. |
Songyan Zhao; Bingxuan Li; Yufei Tian; Nanyun Peng; | naacl | 2025-05-04 |
| 298 | A Data-Driven Method for Analyzing and Quantifying Lyrics-Dance Motion Relationships Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this challenge, we hypothesize that lyrics and dance motions that co-occur across multiple songs are related. Based on this hypothesis, we propose a novel data-driven method to detect the parts of songs where meaningful relationships between lyrics and dance motions exist. |
Kento Watanabe; Masataka Goto; | naacl | 2025-05-04 |
| 299 | Detecting Musical Deepfakes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This study investigates the detection of AI-generated songs using the FakeMusicCaps dataset by classifying audio as either deepfake or human. |
Nick Sunday; | arxiv-cs.SD | 2025-05-03 |
| 300 | Exploring The Diversity of Music Experiences for Deaf and Hard of Hearing Individuals Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music plays an important role in the personal fulfillment and cognitive performance of deaf and hard of hearing (DHH) individuals. Since deafness is a spectrum — as are DHH … |
Kyrie Zhixuan Zhou; Weirui Peng; Yuhan Liu; Rachel F. Adler; | Proceedings of the ACM on Human-Computer Interaction | 2025-05-02 |
| 301 | AI-Empowered Consumer Behavior Modeling Framework for Music Recommendation Over Heterogeneous Electronics Products Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Amidst the fast progress of technology and the widespread availability of music streaming platforms, there is a pressing need to provide precise and reliable music recommendations … |
Ke Zhang; Amin Yousefpour; Daohua Pan; Jiajia Li; Guangwu Hu; | IEEE Transactions on Consumer Electronics | 2025-05-01 |
| 302 | Performance Analysis of A 2D-MUSIC Algorithm for Parametric Near-Field Channel Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this letter, we address parametric channel estimation in a multi-user multiple-input multiple-output system within the radiative near-field of the base station array with … |
DOĞA GÜRGÜNOĞLU et. al. | IEEE Wireless Communications Letters | 2025-05-01 |
| 303 | A Survey on Multimodal Music Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This survey provides a comprehensive overview of the current state-of-the-art in MMER. Discussing the different approaches and techniques used in this field, the paper introduces a four-stage MMER framework, including multimodal data selection, feature extraction, feature processing, and final emotion prediction. |
Rashini Liyanarachchi; Aditya Joshi; Erik Meijering; | arxiv-cs.MM | 2025-04-26 |
| 304 | MVPrompt: Building Music-Visual Prompts for AI Artists to Craft Music Video Mise-en-scène Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music videos have traditionally been the domain of experts, but with text-to-video generative AI models, AI artists can now create them more easily. However, accurately reflecting … |
ChungHa Lee; DaeHo Lee; Jin-Hyuk Hong; | Proceedings of the 2025 CHI Conference on Human Factors in … | 2025-04-25 |
| 305 | Exploring The Potential of Music Generative AI for Music-Making By Deaf and Hard of Hearing People Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recent advancements in text-to-music generative AI (GenAI) have significantly expanded access to music creation. However, deaf and hard of hearing (DHH) individuals remain largely … |
Youjin Choi; JaeYoung Moon; JinYoung Yoo; Jin-Hyuk Hong; | Proceedings of the 2025 CHI Conference on Human Factors in … | 2025-04-25 |
| 306 | EuterPen: Unleashing Creative Expression in Music Score Writing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music notation programs force composers to follow the many rules of the staff notation when writing music and constantly seek to optimize symbol placement, making numerous … |
Vincent Cavez; Catherine Letondal; Caroline Appert; Emmanuel Pietriga; | Proceedings of the 2025 CHI Conference on Human Factors in … | 2025-04-25 |
| 307 | MV-Crafter: An Intelligent System for Music-guided Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In response, we present MV-Crafter, a system capable of producing high-quality music videos with synchronized music-video rhythm and style. |
CHUER CHEN et. al. | arxiv-cs.HC | 2025-04-24 |
| 308 | The Musical Mastermind: A Case Study on Fostering Music Theory and Computational Thinking Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: There is great potential in the introduction of gamified, interdisciplinary learning approaches in various education environments. This paper explores the effectiveness of Musical … |
Ioannis Sarlis; D. Kotsifakos; Christos Douligeris; | 2025 IEEE Global Engineering Education Conference (EDUCON) | 2025-04-22 |
| 309 | DRAGON: Distributional Rewards Optimize Diffusion Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Distributional RewArds for Generative OptimizatioN (DRAGON), a versatile framework for fine-tuning media generation models towards a desired outcome. |
Yatong Bai; Jonah Casebeer; Somayeh Sojoudi; Nicholas J. Bryan; | arxiv-cs.SD | 2025-04-21 |
| 310 | MusFlow: Multimodal Music Generation Via Conditional Flow Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite advancements in generating music from specific textual descriptions (e.g., style, genre, instruments), the practical application is still hindered by ordinary users’ limited expertise or time to write accurate prompts. To bridge this application gap, this paper introduces MusFlow, a novel multimodal music generation model using Conditional Flow Matching. |
Jiahao Song; Yuzhao Wang; | arxiv-cs.SD | 2025-04-18 |
| 311 | Apollo: An Interactive Environment for Generating Symbolic Musical Phrases Using Corpus-based Style Imitation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Apollo, an interactive music application for generating symbolic phrases of conventional western music using corpus-based style imitation techniques. |
Renaud Bougueng Tchemeube; Jeff Ens; Philippe Pasquier; | arxiv-cs.HC | 2025-04-18 |
| 312 | A Survey on Cross-Modal Interaction Between Music and Multimodal Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This survey aims to provide a comprehensive review of multimodal tasks related to music, outlining how music contributes to multimodal learning and offering insights for researchers seeking to expand the boundaries of computational music. |
SIFEI LI et. al. | arxiv-cs.MM | 2025-04-17 |
| 313 | Effects of Structural Reflection-Promoting Mechanism-Based Peer Assessment on Students’ Vocal Music Learning Performance and Perceptions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Vocal music education is a skill‐oriented course. Students not only need to improve their skills through repeated practice, but also need to learn self‐reflection on their singing … |
CHEN-CHEN LIU et. al. | J. Comput. Assist. Learn. | 2025-04-16 |
| 314 | Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite the significant progress in controllable music generation and editing, challenges remain in the quality and length of generated music due to the use of Mel-spectrogram representations and UNet-based model structures. To address these limitations, we propose a novel approach using a Diffusion Transformer (DiT) augmented with an additional control branch using ControlNet. |
S. Hou; | icassp | 2025-04-15 |
| 315 | MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces MQAD, a music QA dataset built on the Million Song Dataset (MSD), encompassing a rich array of musical features – including beat, chord, key, structure, instrument, and genre — across 270,000 tracks, featuring nearly 3 million diverse questions and captions. |
Z. OUYANG et. al. | icassp | 2025-04-15 |
| 316 | Exploring Acoustic Similarity in Emotional Speech and Music Via Self-Supervised Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we revisit the acoustic similarity between emotion speech and music, starting with an analysis of the layerwise behavior of SSL models for Speech Emotion Recognition (SER) and Music Emotion Recognition (MER). |
Y. Sun; Z. Zhao; K. Richmond; Y. Li; | icassp | 2025-04-15 |
| 317 | A Mamba-based Network for Semi-supervised Singing Melody Extraction Using Confidence Binary Regularization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Thirdly, transformers typically require large amounts of labeled data to achieve optimal performance, but the SME task lacks sufficient annotated data. To address these issues, in this paper, we propose a mamba-based network, called SpectMamba, for semi-supervised singing melody extraction using confidence binary regularization. |
X. HE et. al. | icassp | 2025-04-15 |
| 318 | DCD-MUSIC: Deep-Learning-Aided Cascaded Differentiable MUSIC Algorithm for Near-Field Localization of Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work introduces deep-learning-aided cascaded differentiable MUSIC (DCD-MUSIC) that augments MUSIC near-field localization with dedicated deep neural networks (DNNs), allowing it to operate reliably and interpretably. |
A. Gast; L. Le Magoarou; N. Shlezinger; | icassp | 2025-04-15 |
| 319 | CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Contrastive Long-form Language-Audio Pretraining (CoLLAP) to significantly extend the perception window for both the input audio (up to 5 minutes) and the language descriptions (exceeding 250 words), while enabling contrastive learning across modalities and temporal dynamics. |
J. WU et. al. | icassp | 2025-04-15 |
| 320 | A Singing Melody Extraction Network Via Self-Distillation and Multi-Level Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a singing melody extraction network consisting of five stacked multi-scale feature time-frequency aggregation (MF-TFA) modules. |
Y. HU et. al. | icassp | 2025-04-15 |
| 321 | SPSinger: Multi-Singer Singing Voice Synthesis with Short Reference Prompt Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To overcome the challenge of requiring long audio prompts during inference, we introduce the Latent Prompt Adaptation Model (LPAM), a Transformer-based module that derives timbre features from global embeddings. |
J. Zhao; C. Low; Y. Wang; | icassp | 2025-04-15 |
| 322 | Multimodal Fusion for EEG Emotion Recognition in Music with A Multi-Task Learning Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a novel EEG-based emotion recognition approach for music, employing a two-stage training framework that integrates emotion representations from music, lyrics, and EEG. |
S. Huang; Z. Jin; D. Li; J. Han; X. Tao; | icassp | 2025-04-15 |
| 323 | MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an automatic assessment task for TTM models to align with human perception. |
C. Liu; | icassp | 2025-04-15 |
| 324 | FUTGA-MIR: Enhancing Fine-grained and Temporally-aware Music Understanding with Music Information Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While some existing music LLMs have been augmented with temporally-aware music captions, music information retrieval (MIR) features conventionally do not exist in music caption datasets, and are thus neglected by music LLMs. To bridge the gap between recent music LLMs and conventional music information retrieval tasks, we propose FUTGA-MIR (Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation with Music Information Retrieval) to enhance the existing music LLMs by augmenting them with MIR features and aligning with human feedback. |
J. Wu; | icassp | 2025-04-15 |
| 325 | Benchmarking Music Generation Models and Metrics Via Human Preference Studies Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we generate 6k songs using 12 state-of-the-art models and conduct a survey of 15k pairwise audio comparisons with 2.5k human participants to evaluate the correlation between human preferences and widely used metrics. |
F. Grötschla; A. Solak; L. A. Lanzendörfer; R. Wattenhofer; | icassp | 2025-04-15 |
| 326 | Coarse-to-Fine Text-to-Music Latent Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce DiscoDiff, a text-to-music generative model that utilizes two latent diffusion models to produce high-fidelity 44.1kHz music hierarchically. |
L. A. Lanzendörfer; T. Lu; N. Perraudin; D. Herremans; R. Wattenhofer; | icassp | 2025-04-15 |
| 327 | Generating Gezi Opera Scores with A Large Language Model and A High-Quality Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we collect pictures of the jianpu of the Chinese Gezi opera and manually construct a high-quality standard data set of Chinese Gezi opera scores, which can be read by music notation software. |
Z. Lei; K. Gu; P. Bai; X. Shi; | icassp | 2025-04-15 |
| 328 | Music Tagging with Classifier Group Chains Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose music tagging with classifier chains that model the interplay of music tags. |
T. Hasumi; T. Komatsu; Y. Fujita; | icassp | 2025-04-15 |
| 329 | Classifying Music-Induced Emotion Using Multi-Modal Ensembles of EEG and Audio Feature Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present our submission to the EEG-Music Emotion Recognition Challenge at ICASSP 2025. |
P. PAUKNER et. al. | icassp | 2025-04-15 |
| 330 | HANet: A Harmonic Attention-Based Network for Singing Melody Extraction from Polyphonic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Harmonic relationships have been shown to be crucial in this task, but most existing models based on Convolutional Neural Networks (CNNs) struggle to capture long-range harmonic dependencies. To address this, we propose a Harmonic Attention-based Network (HANet) for singing melody extraction from polyphonic music, which includes multiple sampling layers. |
S. Wang; X. Kong; H. Huang; K. Wang; Y. Hu; | icassp | 2025-04-15 |
| 331 | SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and An Open-Source Professional Testset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a high-fidelity singing voice conversion system. |
Y. Zhou; | icassp | 2025-04-15 |
| 332 | Subtractive Training for Music Stem Insertion Using Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Subtractive Training, a simple and novel method for synthesizing individual musical instrument stems given other instruments as context. |
I. Villa-Renteria; | icassp | 2025-04-15 |
| 333 | Perceptual Noise-Masking with Music Through Deep Spectral Envelope Shaping Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Indeed, a music signal can mask some of the noise’s frequency components due to the effect of simultaneous masking. In this article, we propose a neural network based on a psychoacoustic masking model, designed to enhance the music’s ability to mask ambient noise by reshaping its spectral envelope with predicted filter frequency responses. |
C. Berger; R. Badeau; S. Essid; | icassp | 2025-04-15 |
| 334 | SONIQUE: Video Background Music Generation Using Unpaired Audio-Visual Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present SONIQUE, a model for generating background music tailored to video content. |
L. Zhang; M. Fuentes; | icassp | 2025-04-15 |
| 335 | Investigation of Perceptual Music Similarity Focusing on Each Instrumental Part Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents an investigation of perceptual similarity between music tracks focusing on each individual instrumental part based on a large-scale listening test towards developing an instrumental-part-based music retrieval. |
Y. Hashizume; T. Toda; | icassp | 2025-04-15 |
| 336 | MusicLIME: Explainable Multimodal Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce MusicLIME, a model-agnostic feature importance explanation method designed for multimodal music models. |
T. Sotirou; V. Lyberatos; O. M. Mastromichalakis; G. Stamou; | icassp | 2025-04-15 |
| 337 | Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce a latent diffusion-based multi-track generation model capable of both source separation and multi-track music synthesis by learning the joint probability distribution of tracks sharing a musical context. |
T. Karchkhadze; M. R. Izadi; S. Dubnov; | icassp | 2025-04-15 |
| 338 | A Novel Compressive Compound Word Encoding and Independent Word Attention for Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To enable models to capture the dependencies between independent tokens in super tokens when using compound word encoding, we propose compressive compound word (CCP) encoding and independent word attention (IWA). |
L. Zhou; L. Yin; Y. Qian; | icassp | 2025-04-15 |
| 339 | MotionComposer: Enhancing Rhythmic Music Generation with Adaptive Retrieval Reference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present MotionComposer, a novel retrieval-augmented, easy-to-hard training approach designed to enhance rhythmic music generation. |
J. Wang; L. Liu; J. Wang; | icassp | 2025-04-15 |
| 340 | Learning Music Audio Representations With Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Understanding how these models behave in limited-data scenarios could be crucial for developing techniques to tackle them. In this work, we investigate the behavior of several music audio representation models under limited-data learning regimes. We consider music models with various architectures, training paradigms, and input durations, and train them on data collections ranging from 5 to 8,000 minutes long. |
C. Plachouras; E. Benetos; J. Pauwels; | icassp | 2025-04-15 |
| 341 | EEG-Music Emotion Recognition: Challenge Overview Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose two tracks: (1) Person Identification aims to identify the subject from whom the EEG was recorded, while (2) Emotion Recognition targets the decoding of emotional state of the subject while listening to a musical stimulus. |
S. CALCAGNO et. al. | icassp | 2025-04-15 |
| 342 | DOSE: Drum One-Shot Extraction from Music Mixture Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces Drum One-Shot Extraction, a task in which the goal is to extract drum one-shots that are present in the music mixture. |
S. Hwang; S. Kang; K. Kim; S. Ahn; K. Lee; | icassp | 2025-04-15 |
| 343 | Investigating Factors Related to The Naturalness of Synthesized Unison Singing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we focus on unison singing, which is to have several singers singing the same melody together. |
K. Nishizawa; R. Yamamoto; W. -C. Huang; T. Toda; | icassp | 2025-04-15 |
| 344 | Bootstrapping Language-Audio Pre-training for Music Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce BLAP, a model capable of generating high-quality captions for music. |
L. A. Lanzendörfer; C. Pinkl; N. Perraudin; R. Wattenhofer; | icassp | 2025-04-15 |
| 345 | Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel method based on dual diffusion bridges, trained using the CocoChorales Dataset, which consists of unpaired monophonic single-instrument audio data. |
M. Mancusi; | icassp | 2025-04-15 |
| 346 | Ultra Lightweight Singing Melody Extraction Via Combination of Convolution and MLP Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose the lightweight convolutional MLP (LcMLP), an ultra lightweight model without sacrificing the performance. |
J. Liu; K. Dong; Q. Huang; S. Yu; W. Li; | icassp | 2025-04-15 |
| 347 | Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Other approaches involve performing source separation before conversion, but this often introduces noticeable artifacts, leading to a significant drop in conversion quality and increasing the user’s operational costs. To address these issues, we introduce a novel SVC method that uses self-supervised representation-based melody features to improve melody modeling accuracy in the presence of BGM. |
W. CHEN et. al. | icassp | 2025-04-15 |
| 348 | Semi-Supervised Contrastive Learning for Controllable Video-to-Music Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, identifying the best music for a video can be a difficult and time-consuming task. To address this challenge, we propose a novel framework for automatically retrieving a matching music clip for a given video, and vice versa. |
S. Stewart; G. KV; L. Lu; A. Fanelli; | icassp | 2025-04-15 |
| 349 | MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal Precision Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, most research focuses on Western music, facing challenges in generating Chinese traditional melodies, particularly in capturing modal characteristics and emotional expression. To address this, we propose the Dual-Feature Modeling Module, which integrates the long-range modeling of the Mamba Block with the global structure capturing of the Transformer Block. |
J. CHEN et. al. | icassp | 2025-04-15 |
| 350 | PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Such issues highlight the need for publicly available, copyright-free musical data, in which there is a large shortage, particularly for symbolic music data. To alleviate this issue, we present PDMX: a large-scale open-source dataset of over 250K public domain MusicXML scores collected from the score-sharing forum MuseScore, making it the largest available copyright-free symbolic music dataset to our knowledge. |
P. Long; Z. Novack; T. Berg-Kirkpatrick; J. McAuley; | icassp | 2025-04-15 |
| 351 | Naturalistic Music Decoding from EEG Data Via Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this article, we explore the potential of using latent diffusion models, a family of powerful generative models, for the task of reconstructing naturalistic music from electroencephalogram (EEG) recordings. |
E. Postolache; | icassp | 2025-04-15 |
| 352 | Diff4Steer: Steerable Diffusion Prior for Generative Music Retrieval with Semantic Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Modern music retrieval systems often rely on fixed representations of user preferences, limiting their ability to capture users’ diverse and uncertain retrieval needs. To address this limitation, we introduce Diff4Steer, a novel generative retrieval framework that employs lightweight diffusion models to synthesize diverse seed embeddings representing potential directions for music exploration. |
X. Bao; | icassp | 2025-04-15 |
| 353 | Melody Structure Transfer Network: Generating Music with Separable Self-Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an approach to transfer the structural characteristics of training samples for generating music. |
J. WU et. al. | icassp | 2025-04-15 |
| 354 | MusicGen-Stem: Multi-stem Music Generation and Edition Through Autoregressive Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To do so, we train one specialized compression algorithm per stem to tokenize the music into parallel streams of tokens. |
S. Rouard; R. S. Roman; Y. Adi; A. Roebel; | icassp | 2025-04-15 |
| 355 | F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose F-StrIPE, a structure-informed PE scheme that works in linear complexity. |
M. Agarwal; C. Wang; G. Richard; | icassp | 2025-04-15 |
| 356 | SteerMusic: Enhanced Musical Consistency for Zero-shot Text-Guided and Personalized Music Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose two music editing methods that enhance the consistency between the original and edited music by leveraging score distillation. |
XINLEI NIU et. al. | arxiv-cs.SD | 2025-04-14 |
| 357 | Progressive Rock Music Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We conducted a comparative analysis of various machine learning techniques. |
Arpan Nagar; Joseph Bensabat; Jokent Gaza; Moinak Dey; | arxiv-cs.SD | 2025-04-14 |
| 358 | Compose with Me: Collaborative Music Inpainter for Symbolic Music Infilling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The field of music generation has seen a surge of interest from both academia and industry, with innovative platforms such as Suno, Udio, and SkyMusic earning widespread … |
Zhejing Hu; Yan Liu; Gong Chen; Bruce X. B. Yu; | AAAI Conference on Artificial Intelligence | 2025-04-11 |
| 359 | Extending Visual Dynamics for Video-to-Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose DyViM, a novel framework to enhance dynamics modeling for video-to-music generation. |
Xiaohao Liu; Teng Tu; Yunshan Ma; Tat-Seng Chua; | arxiv-cs.MM | 2025-04-10 |
| 360 | Optimality of Gradient-MUSIC for Spectral Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce the Gradient-MUSIC algorithm for estimating the unknown frequencies and amplitudes of a nonharmonic signal from noisy time samples. While the classical MUSIC algorithm performs a computationally expensive search over a fine grid, Gradient-MUSIC is significantly more efficient and eliminates the need for discretization over a fine grid by using optimization techniques. It coarsely scans the 1D landscape to find initialization simultaneously for all frequencies, followed by parallelizable local refinement via gradient descent. |
Albert Fannjiang; Weilin Li; Wenjing Liao; | arxiv-cs.IT | 2025-04-09 |
| 361 | Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a unified framework based on kernel methods to analyze both families of efficient PEs. |
Manvi Agarwal; Changhong Wang; Gael Richard; | arxiv-cs.SD | 2025-04-07 |
| 362 | Deconstructing Jazz Piano Style Using Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we focus on musical style, which benefits from a rich theoretical and mathematical analysis tradition. |
Huw Cheston; Reuben Bance; Peter M. C. Harrison; | arxiv-cs.SD | 2025-04-07 |
| 363 | Confidence-Enhanced Models for Indian Art Music Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine learning models for music have facilitated advancements in core applications like music pedagogy, singer identification, Rāga recognition, transcription, and genre … |
Sumit Kumar; Parampreet Singh; Vipul Arora; | 2025 IEEE International Conference on Acoustics, Speech, … | 2025-04-06 |
| 364 | Graphs Are Everywhere — Psst! In Music Recommendation Too Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the efficacy of Graph Convolutional Networks (GCN), GraphSAGE, and Graph Transformer (GT) models in learning embeddings that effectively capture intricate relationships between music items and genres represented within graph structures. |
Bharani Jayakumar; Orkun Özoğlu; | arxiv-cs.IR | 2025-04-03 |
| 365 | AE-AMT: Attribute-Enhanced Affective Music Generation With Compound Word Representation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Affective music generation is a challenge for symbolic music generation. Existing methods face the problem that the perceived emotion of the generated music is not evident because … |
Weiyi Yao; C. L. P. Chen; Zongyan Zhang; Tong Zhang; | IEEE Transactions on Computational Social Systems | 2025-04-01 |
| 366 | Two-Stage Spatial Whitening and Normalized MUSIC for Robust DOA Estimation of GNSS Signals Under Jamming Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The estimation of the direction of arrival (DOA) of navigation signals is a critical function of a global navigation satellite system (GNSS) array receiver for applications such … |
Chuanrui Wang; X. Cui; Gang Liu; Mingquan Lu; | IEEE Transactions on Aerospace and Electronic Systems | 2025-04-01 |
| 367 | A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The review covers modality representation, multi-modal data alignment, and their utilization to guide music generation. |
SHUYU LI et. al. | arxiv-cs.SD | 2025-04-01 |
| 368 | Exploring The Impact of An LLM-Powered Teachable Agent on Learning Gains and Cognitive Load in Music Education Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study examines the impact of an LLM-powered teachable agent, grounded in the Learning by Teaching (LBT) pedagogy, on students’ music theory learning and cognitive load. |
Lingxi Jin; Baicheng Lin; Mengze Hong; Kun Zhang; Hyo-Jeong So; | arxiv-cs.HC | 2025-04-01 |
| 369 | Hybrid Learning Module-Based Transformer for Multitrack Music Generation With Music Theory Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In recent years, multitrack music generation has garnered significant attention in both academic and industrial spheres for its versatile utilization of various instruments in … |
Y. TIE et. al. | IEEE Transactions on Computational Social Systems | 2025-04-01 |
| 370 | Semantic Communication for VR Music Live Streaming With Rate Splitting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Virtual reality (VR) live streaming has established a remarkable transformation of music performances that facilitates a unique interaction between artists and their audiences … |
Jiaqi Zou; Lvxin Xu; Songlin Sun; | IEEE Transactions on Computational Social Systems | 2025-04-01 |
| 371 | Counterfactual Music Recommendation for Mitigating Popularity Bias Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music recommendation systems aim to suggest tracks that users may enjoy. However, the accuracy of recommendation results is affected by popularity bias. Previous studies have … |
Jidong Yuan; Bingyu Gao; Xiaokang Wang; Haiyang Liu; Lingyin Zhang; | IEEE Transactions on Computational Social Systems | 2025-04-01 |
| 372 | Text2Tracks: Prompt-based Music Recommendation Via Generative Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose to address the task of prompt-based music recommendation as a generative retrieval task. |
ENRICO PALUMBO et. al. | arxiv-cs.IR | 2025-03-31 |
| 373 | Music Information Retrieval on Representative Mexican Folk Vocal Melodies Through MIDI Feature Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study analyzes representative Mexican folk vocal melodies using MIDI feature extraction, examining ambitus, pitch-class entropy, and interval distribution. |
Mario Alberto Vallejo Reyes; | arxiv-cs.SD | 2025-03-31 |
| 374 | Systematic CXL Memory Characterization and Performance Analysis at Scale IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Compute Express Link (CXL) has emerged as a pivotal interconnect for memory expansion. Despite its potential, the performance implications of CXL across devices, latency regimes, … |
JINSHU LIU et. al. | Proceedings of the 30th ACM International Conference on … | 2025-03-30 |
| 375 | CrossMuSim: A Cross-Modal Framework for Music Similarity Retrieval with LLM-Powered Text Description Sourcing and Mining Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To overcome the scarcity of high-quality text-music paired data, this paper introduces a dual-source data acquisition approach combining online scraping and LLM-based prompting, where carefully designed prompts leverage LLMs’ comprehensive music knowledge to generate contextually rich descriptions. |
TRISTAN TSOI et. al. | arxiv-cs.SD | 2025-03-29 |
| 376 | Teaching LLMs Music Theory with In-Context Learning and Chain-of-Thought Prompting: Pedagogical Strategies for Machines Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This study evaluates the baseline capabilities of Large Language Models (LLMs) like ChatGPT, Claude, and Gemini to learn concepts in music theory through in-context learning and chain-of-thought prompting. |
Liam Pond; Ichiro Fujinaga; | arxiv-cs.SD | 2025-03-28 |
| 377 | Vision-to-Music Generation: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we systematically review the research progress in the field of vision-to-music generation. |
ZHAOKAI WANG et. al. | arxiv-cs.CV | 2025-03-27 |
| 378 | Tune It Up: Music Genre Transfer and Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we adapt and improve the CycleGAN model to perform music style transfer on the Jazz and Classic genres. |
Fidan Samet; Oguz Bakir; Adnan Fidan; | arxiv-cs.SD | 2025-03-27 |
| 379 | Emotion Detection and Music Recommendation System Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As artificial intelligence becomes more and more ingrained in daily life, we present a novel system that uses deep learning for music recommendation and emotion-based detection. |
Swetha Kambham; Hubert Jhonson; Sai Prathap Reddy Kambham; | arxiv-cs.CV | 2025-03-26 |
| 380 | Analyzable Chain-of-Musical-Thought Prompting for High-Fidelity Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the conventional next-token prediction paradigm in AR models does not align with the human creative process in music composition, potentially compromising the musicality of generated samples. To overcome this limitation, we introduce MusiCoT, a novel chain-of-thought (CoT) prompting technique tailored for music generation. |
MAX W. Y. LAM et. al. | arxiv-cs.SD | 2025-03-25 |
| 381 | CCMusic: An Open and Diverse Database for Chinese Music Information Retrieval Research Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Data are crucial in various computer-related fields, including music information retrieval (MIR), an interdisciplinary area bridging computer science and music. This paper introduces CCMusic, an open and diverse database comprising multiple datasets specifically designed for tasks related to Chinese music, highlighting our focus on this culturally rich domain. |
MONAN ZHOU et. al. | arxiv-cs.IR | 2025-03-24 |
| 382 | Music Similarity Representation Learning Focusing on Individual Instruments with Source Separation and Human Preference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose three methods that effectively improve performance. |
Takehiro Imamura; Yuka Hashizume; Wen-Chin Huang; Tomoki Toda; | arxiv-cs.SD | 2025-03-24 |
| 383 | A DBO-Based Improved 2D-MUSIC Algorithm for Localization Using OFDM Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper considers a single input multiple output (SIMO) integrated sensing and communication (ISAC) system, where orthogonal frequency division multiplexing (OFDM) … |
Xinyi Hu; Lingxiang Li; Zhen Wang; Zhi Chen; | 2025 IEEE Wireless Communications and Networking Conference … | 2025-03-24 |
| 384 | Eurovision Song Contest: Can Juries Assess The Quality of Songs Objectively? Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Nikola Kadoić; N. Z. Hrustek; Maja Gligora Markovic; | Central Eur. J. Oper. Res. | 2025-03-20 |
| 385 | A Bird Song Detector for Improving Bird Identification Through Deep Learning: A Case Study from Doñana Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A key challenge in bird species identification is that many recordings either lack target species or contain overlapping vocalizations, complicating automatic identification. To address these problems, we developed a multi-stage pipeline for automatic bird vocalization identification in Doñana National Park (SW Spain), a wetland of high conservation concern. |
ALBA MÁRQUEZ-RODRÍGUEZ et. al. | arxiv-cs.SD | 2025-03-19 |
| 386 | Development and Evaluation of A Mixed Reality Music Visualization for A Live Performance Based on Music Information Retrieval Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The present study explores the development and evaluation of a mixed reality music visualization for a live music performance. Real-time audio analysis and crossmodal … |
Matthias Erdmann; Markus von Berg; Jochen Steffens; | Frontiers Virtual Real. | 2025-03-19 |
| 387 | Musicolors: Bridging Sound and Visuals For Synesthetic Creative Musical Experience Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Thus, we developed musicolors, a web-based music visualization library that works in real time. |
ChungHa Lee; Jin-Hyuk Hong; | arxiv-cs.HC | 2025-03-18 |
| 388 | SINGER: Stochastic Network Graph Evolving Operator for High Dimensional PDEs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a novel framework, StochastIc Network Graph Evolving operatoR (SINGER), for learning the evolution operator of high-dimensional partial differential equations (PDEs). |
MINGQUAN FENG et. al. | iclr | 2025-03-17 |
| 389 | SONICS: Synthetic Or Not – Identifying Counterfeit Songs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Additionally, existing datasets lack music-lyrics diversity, long-duration songs, and open-access fake songs. To address these gaps, we introduce SONICS, a novel dataset for end-to-end Synthetic Song Detection (SSD), comprising over 97k songs (4,751 hours) with over 49k synthetic songs from popular platforms like Suno and Udio. |
Md Awsafur Rahman; Zaber Ibn Abdul Hakim; Najibul Haque Sarker; Bishmoy Paul; Shaikh Anowarul Fattah; | iclr | 2025-03-17 |
| 390 | Serenade: A Singing Style Conversion Framework Based On Audio Infilling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose Serenade, a novel framework for the singing style conversion (SSC) task. |
Lester Phillip Violeta; Wen-Chin Huang; Tomoki Toda; | arxiv-cs.SD | 2025-03-16 |
| 391 | Children’s Enlightenment Music Education Based on Digital Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper represents an empirical study that examines the impact of specific teaching methods on the effectiveness of preschool music education. The current study seeks to … |
Lixin Wang; | J. Comput. Assist. Learn. | 2025-03-16 |
| 392 | Cultivation of Innovative Ability of Vocal Music Education Based on Big Data Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the wide application of big data technology in education, exploring its role in vocal music education has become particularly important. The purpose of this study is to … |
Siming Lin; | Journal of Computational Methods in Sciences and Engineering | 2025-03-14 |
| 393 | Artificial Intelligence in Music Education: Exploring Applications, Benefits, and Challenges Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This research investigates the present scenario of developments in artificial intelligence in music education in general and in various methods used to improve pedagogical … |
Yuanyang Yue; Yunqi Jing; | 2025 14th International Conference on Educational and … | 2025-03-14 |
| 394 | Cross-Modal Learning for Music-to-Music-Video Description Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we focus on the MV description generation task and propose a comprehensive pipeline encompassing training data construction and multimodal model fine-tuning. |
ZHUOYUAN MAO et. al. | arxiv-cs.SD | 2025-03-14 |
| 395 | Impact of Mobile Technology-Integrated Dynamic Assessment on Students’ Music Rhythm Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Rhythm is the fundamental element of music and one of the indispensable aspects of training in music education, making it highly valued. However, due to the limitations of … |
C. Koong; Chih-Hung Chen; Yu‐Tzu Chen; Gwo-Haur Hwang; | J. Comput. Assist. Learn. | 2025-03-05 |
| 396 | Be The Beat: AI-Powered Boombox for Music Suggestion from Freestyle Dance Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Dance has traditionally been guided by music throughout history and across cultures, yet the concept of dancing to create music is rarely explored. In this paper, we introduce Be … |
Ethan Chang; Zhixing Chen; Jb Labrune; Marcelo Coelho; | Proceedings of the Nineteenth International Conference on … | 2025-03-04 |
| 397 | What Sounds Dangerous? Establishing Correlations Of Musical Features and Perceived Safety in HRI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study explores the potential of music-driven sonification as an effective method for improving safety in human-robot collaboration. Building on the rich expressive … |
Amit Rogel; Jack Hayley; Richard J. Savery; Gil Weinberg; | 2025 20th ACM/IEEE International Conference on Human-Robot … | 2025-03-04 |
| 398 | Augmenting Online Meetings with Context-Aware Real-time Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we investigate the potential of generative artificial intelligence (GenAI) for real-time music generation to enrich online meetings. |
Haruki Suzawa; Ko Watanabe; Andreas Dengel; Shoya Ishimaru; | arxiv-cs.HC | 2025-03-03 |
| 399 | BGM2Pose: Active 3D Human Pose Estimation with Non-Stationary Sounds Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose BGM2Pose, a non-invasive 3D human pose estimation method using arbitrary music (e.g., background music) as active sensing signals. |
YUTO SHIBATA et. al. | arxiv-cs.CV | 2025-03-01 |
| 400 | Hiding Speech in Music Files Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Xiaohong Zhang; Shijun Xiang; Hongbin Huang; | J. Inf. Secur. Appl. | 2025-03-01 |
| 401 | DGFM: Full Body Dance Generation Driven By Music Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In music-driven dance motion generation, most existing methods use hand-crafted features and neglect that music foundation models have profoundly impacted cross-modal content generation. To bridge this gap, we propose a diffusion-based method that generates dance movements conditioned on text and music. |
Xinran Liu; Zhenhua Feng; Diptesh Kanojia; Wenwu Wang; | arxiv-cs.SD | 2025-02-27 |
| 402 | JEN-1 DreamStyler: Customized Musical Concept Learning Via Pivotal Parameters Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel method for customized text-to-music generation, which can capture the concept from a two-minute reference music and generate a new piece of music conforming to the concept. |
Boyu Chen; Peike Li; Yao Yao; Alex Wang; | aaai | 2025-02-25 |
| 403 | Detecting Music Performance Errors with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To address (2), we present a novel data generation technique capable of creating large-scale synthetic music error datasets. |
BENJAMIN SHIUE-HAL CHOU et. al. | aaai | 2025-02-25 |
| 404 | Text2midi: Generating Symbolic Music from Captions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces text2midi, an end-to-end model to generate MIDI files from textual descriptions. |
KESHAV BHANDARI et. al. | aaai | 2025-02-25 |
| 405 | JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This departure from the typical workflows of professional composers hinders the ability to refine details in specific tracks. To address this gap, we propose JEN-1 Composer, a unified framework designed to efficiently model marginal, conditional, and joint distributions over multi-track music using a single model. |
Yao Yao; Peike Li; Boyu Chen; Alex Wang; | aaai | 2025-02-25 |
| 406 | NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce NotaGen, a symbolic music generation model aiming to explore the potential of producing high-quality classical sheet music. |
YASHAN WANG et. al. | arxiv-cs.SD | 2025-02-25 |
| 407 | SongSong: A Time Phonograph for Chinese SongCi Music from Thousand of Years Away Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce SongSong, the first music generation model capable of restoring Chinese SongCi to our knowledge. |
JILIANG HU et. al. | aaai | 2025-02-25 |
| 408 | GVMGen: A General Video-to-Music Generation Model with Hierarchical Attentions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we present the General Video-to-Music Generation model (GVMGen), designed to generate music that is highly relevant to the video input. |
HEDA ZUO et. al. | aaai | 2025-02-25 |
| 409 | S²MILE: Semantic-and-Structure-Aware Music-Driven Lyric Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we bridge the structural and semantic gap between music and lyrics by proposing an end-to-end model for music-driven lyric generation. |
Mu You; Fang Zhang; Shuai Zhang; Linli Xu; | aaai | 2025-02-25 |
| 410 | SongEditor: Adapting Zero-Shot Song Generation Language Model As A Multi-Task Editor Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present SongEditor, the first song editing paradigm that introduces the editing capabilities into language-modeling song generation approaches, facilitating both segment-wise and track-wise modifications. |
CHENYU YANG et. al. | aaai | 2025-02-25 |
| 411 | UniMuMo: Unified Text, Music, and Motion Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce UniMuMo, a unified multimodal model capable of taking arbitrary text, music, and motion data as input conditions to generate outputs across all three modalities. |
HAN YANG et. al. | aaai | 2025-02-25 |
| 412 | SongGLM: Lyric-to-Melody Generation with 2D Alignment Encoding and Multi-Task Pre-Training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose SongGLM, a lyric-to-melody generation system that leverages 2D alignment encoding and multi-task pre-training based on the General Language Model (GLM) to guarantee the alignment and harmony between lyrics and melodies. |
JIAXING YU et. al. | aaai | 2025-02-25 |
| 413 | CSL-L2M: Controllable Song-Level Lyric-to-Melody Generation Based on Conditional Transformer with Fine-Grained Lyric and Musical Controls Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Due to the difficulty of learning strict yet weak correlations between lyrics and melodies, previous methods have suffered from weak controllability, low-quality and poorly structured generation. To address these challenges, we propose CSL-L2M, a controllable song-level lyric-to-melody generation method based on an in-attention Transformer decoder with fine-grained lyric and musical controls, which is able to generate full-song melodies matched with the given lyrics and user-specified musical attributes. |
Li Chai; Donglin Wang; | aaai | 2025-02-25 |
| 414 | Perceptual Noise-Masking with Music Through Deep Spectral Envelope Shaping Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Indeed, a music signal can mask some of the noise’s frequency components due to the effect of simultaneous masking. In this article, we propose a neural network based on a psychoacoustic masking model, designed to enhance the music’s ability to mask ambient noise by reshaping its spectral envelope with predicted filter frequency responses. |
Clémentine Berger; Roland Badeau; Slim Essid; | arxiv-cs.SD | 2025-02-24 |
| 415 | The GigaMIDI Dataset with Features for Expressive Music Performance Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Distinguishing between non-expressive and expressive MIDI tracks is challenging, as MIDI files do not inherently make this distinction. To address this issue, we introduce a set of innovative heuristics for detecting expressive music performance. |
KEON JU MAVERICK LEE et. al. | arxiv-cs.SD | 2025-02-24 |
| 416 | Characterizations of Kadison–Singer Lattices and Lie Triple Derivations on Kadison–Singer Algebras Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Guangyu An; Tian Fang; Danni Zhao; | Periodica Mathematica Hungarica | 2025-02-24 |
| 417 | ComposeOn Academy: Transforming Melodic Ideas Into Complete Compositions Integrating Music Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing digital audio workstations and music production software often present high entry barriers for users lacking formal musical training. To address this, we introduce ComposeOn, a music theory-based tool designed for users with limited musical knowledge. |
Hongxi Pu; Futian Jiang; Zihao Chen; Xingyue Song; | arxiv-cs.HC | 2025-02-21 |
| 418 | Visual and Auditory Aesthetic Preferences Across Cultures Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a large-scale cross-cultural study examining aesthetic preferences across five distinct modalities extensively explored in the literature: shape, curvature, colour, musical harmony and melody. |
HARIN LEE et. al. | arxiv-cs.MM | 2025-02-20 |
| 419 | Synthesizing Composite Hierarchical Structure from Symbolic Music Corpora Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to analyze music compositions holistically and at multiple granularities, we propose a unified, hierarchical meta-representation of musical structure called the structural temporal graph (STG). |
ILANA SHAPIRO et. al. | arxiv-cs.AI | 2025-02-20 |
| 420 | Note-Level Singing Melody Transcription for Time-Aligned Musical Score Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we consider an extended version of the traditional note-level transcription task that recognizes onset, offset, and pitch, through including extraction of additional note value to generate a time-aligned score from an audio input. |
Leekyung Kim; Sungwook Jeon; Wan Heo; Jonghun Park; | arxiv-cs.SD | 2025-02-17 |
| 421 | NOTA: Multimodal Music Notation Understanding for Visual Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing general-domain visual language models still lack the ability of music notation understanding. Recognizing this gap, we propose NOTA, the first large-scale comprehensive multimodal music notation dataset. |
MINGNI TANG et. al. | arxiv-cs.CV | 2025-02-17 |
| 422 | F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose F-StrIPE, a structure-informed PE scheme that works in linear complexity. |
Manvi Agarwal; Changhong Wang; Gael Richard; | arxiv-cs.SD | 2025-02-14 |
| 423 | CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To advance future research, we release WikiMT-X, a benchmark comprising 1,000 triplets of sheet music, audio, and richly varied text descriptions. |
SHANGDA WU et. al. | arxiv-cs.SD | 2025-02-14 |
| 424 | Video Soundtrack Generation By Aligning Emotions and Temporal Boundaries Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce EMSYNC, a video-based symbolic music generation model that aligns music with a video’s emotional content and temporal boundaries. |
Serkan Sulun; Paula Viana; Matthew E. P. Davies; | arxiv-cs.SD | 2025-02-14 |
| 425 | Music Style Transfer and Creation Method Based on Transfer Learning Algorithm Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the growth of people’s demand for personalized music, how to use AI technology to achieve accurate understanding and creative transformation of music styles has become an … |
Shuiyi Chi; Hao Chen; | Journal of Computational Methods in Sciences and Engineering | 2025-02-14 |
| 426 | Engaging K-12 Students with Flow-Based Music Programming: An Experience Report on Its Impact on Teaching and Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music and computer science (CS) have profound historical and structural connections, with programming music offering a promising avenue for engaging children in CS through … |
ZIFENG LIU et. al. | Proceedings of the 56th ACM Technical Symposium on Computer … | 2025-02-12 |
| 427 | Methods for Pitch Analysis in Contemporary Popular Music: Highlighting Pitch Uncertainty in Primaal’s Commercial Works Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The ultimate goal of the study is to introduce a set of methods suited to the analysis of pitch in contemporary popular music. |
Emmanuel Deruty; Luc Leroy; Yann Macé; David Meredith; | arxiv-cs.SD | 2025-02-12 |
| 428 | Hookpad Aria: A Copilot for Songwriters Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Hookpad Aria, a generative AI system designed to assist musicians in writing Western pop songs. |
CHRIS DONAHUE et. al. | arxiv-cs.SD | 2025-02-12 |
| 429 | Are Expressions for Music Emotions The Same Across Cultures? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A key challenge in cross-cultural research on music emotion is biased stimulus selection and manual curation of taxonomies, predominantly relying on Western music and languages. To address this, we propose a balanced experimental design with nine online experiments in Brazil, the US, and South Korea, involving N=672 participants. |
Elif Celen; Pol van Rijn; Harin Lee; Nori Jacoby; | arxiv-cs.CL | 2025-02-12 |
| 430 | YNote: A Novel Music Notation for Fine-Tuning LLMs in Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These formats are difficult for both machines and humans to interpret due to their variability and intricate structure. To address these challenges, we introduce YNote, a simplified music notation system that uses only four characters to represent a note and its pitch. |
SHAO-CHIEN LU et. al. | arxiv-cs.SD | 2025-02-12 |
| 431 | Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present a study of the datasets and research papers for music generation and quantify the bias and under-representation of genres. |
ATHARVA MEHTA et. al. | arxiv-cs.SD | 2025-02-11 |
| 432 | JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce JamendoMaxCaps, a large-scale music-caption dataset featuring over 362,000 freely licensed instrumental tracks from the renowned Jamendo platform. |
Abhinaba Roy; Renhang Liu; Tongyu Lu; Dorien Herremans; | arxiv-cs.SD | 2025-02-11 |
| 433 | Learning Musical Representations for Music Performance Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Therefore, existing methods tend to answer questions regarding musical performances inaccurately. To bridge the above research gaps, (i) given the intricate multimodal interconnectivity inherent to music data, our primary backbone is designed to incorporate multimodal interactions within the context of music; (ii) to enable the model to learn music characteristics, we annotate and release rhythmic and music sources in the current music datasets; (iii) for time-aware audio-visual modeling, we align the model’s music predictions with the temporal dimension. |
XINGJIAN DIAO et. al. | arxiv-cs.CV | 2025-02-10 |
| 434 | Automatic Identification of Samples in Hip-Hop Music Via Multi-Loss Training and An Artificial Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we show that a convolutional neural network trained on an artificial dataset can identify real-world samples in commercial hip-hop music. We extract vocal, harmonic, and percussive elements from several databases of non-commercial music recordings using audio source separation, and train the model to fingerprint a subset of these elements in transformed versions of the original audio. |
Huw Cheston; Jan Van Balen; Simon Durand; | arxiv-cs.SD | 2025-02-10 |
| 435 | Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Other approaches involve performing source separation before conversion, but this often introduces noticeable artifacts, leading to a significant drop in conversion quality and increasing the user’s operational costs. To address these issues, we introduce a novel SVC method that uses self-supervised representation-based melody features to improve melody modeling accuracy in the presence of BGM. |
WEI CHEN et. al. | arxiv-cs.SD | 2025-02-07 |
| 436 | ImprovNet – Generating Controllable Musical Improvisations with Iterative Corruption Refinement Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite deep learning’s remarkable advances in style transfer across various domains, generating controllable performance-level musical style transfer for complete symbolically … |
KESHAV BHANDARI et. al. | 2025 International Joint Conference on Neural Networks … | 2025-02-06 |
| 437 | ImprovNet — Generating Controllable Musical Improvisations with Iterative Corruption Refinement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents ImprovNet, a transformer-based architecture that generates expressive and controllable musical improvisations through a self-supervised corruption-refinement training strategy. |
KESHAV BHANDARI et. al. | arxiv-cs.SD | 2025-02-06 |
| 438 | Investigation of Perceptual Music Similarity Focusing on Each Instrumental Part Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents an investigation of perceptual similarity between music tracks focusing on each individual instrumental part based on a large-scale listening test towards developing an instrumental-part-based music retrieval. |
Yuka Hashizume; Tomoki Toda; | arxiv-cs.SD | 2025-02-04 |
| 439 | The Beatbots: A Musician-Informed Multi-Robot Percussion Quartet Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose design principles to guide the development of future robotic music systems and identify key robotic music affordances that our musician consultants considered particularly important for robotic music performance. |
Isabella Pu; Jeff Snyder; Naomi Ehrich Leonard; | arxiv-cs.RO | 2025-02-02 |
| 440 | Secure & Personalized Music-to-Video Generation Via CHARCHA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Music is a deeply personal experience and our aim is to enhance this with a fully-automated pipeline for personalized music video generation. |
Mehul Agarwal; Gauri Agarwal; Santiago Benoit; Andrew Lippman; Jean Oh; | arxiv-cs.AI | 2025-02-02 |
| 441 | Application and Research of Music Generation System Based on CVAE and Transformer-XL in Video Background Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the field of music generation using algorithms, processing time-series data has consistently been a complex task. To improve music generation with long sequences, … |
Jun Min; Zhiwei Gao; Lei Wang; | IEEE Transactions on Industrial Informatics | 2025-02-01 |
| 442 | Gamispotify: A Gamified Social Music Recommendation System Based on Users’ Personal Values Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this article, we have introduced Gamispotify. For the first time, in a social network-based environment, and by benefiting from gamification and crowdsourcing, Gamispotify … |
Mohammad Hajarian; Miguel Herrera Carrillo; Paloma Díaz; I. Aedo; | Multimedia Tools and Applications | 2025-02-01 |
| 443 | Music Dynamics Visualization for Music Practice and Education Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Eun Ji Park; | Multimedia Tools and Applications | 2025-01-31 |
| 444 | Every Image Listens, Every Image Dances: Music-Driven Image Animation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce MuseDance, an innovative end-to-end model that animates reference images using both music and text inputs. |
Zhikang Dong; Weituo Hao; Ju-Chiang Wang; Peng Zhang; Pawel Polak; | arxiv-cs.CV | 2025-01-30 |
| 445 | Linguistic Analysis of Sinhala YouTube Comments on Sinhala Music Videos: A Dataset Study Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The purpose of this study is to analyze the behavior of Sinhala comments on YouTube Sinhala song videos using social media comments as primary data sources. |
W. M. Yomal De Mel; Nisansa de Silva; | arxiv-cs.CL | 2025-01-28 |
| 446 | Exploring The Collaborative Co-Creation Process with AI: A Case Study in Novice Music Production Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Furthermore, AI influenced group social dynamics and role division among human creators. Based on these insights, we propose the Human-AI Co-Creation Stage Model and the Human-AI Agency Model, offering new perspectives on collaborative co-creation with AI. |
Yue Fu; Michele Newman; Lewis Going; Qiuzi Feng; Jin Ha Lee; | arxiv-cs.HC | 2025-01-25 |
| 447 | Exploring GPT’s Ability As A Judge in Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we use a systematic prompt engineering approach for LLMs to solve MIR problems. |
Kun Fang; Ziyu Wang; Gus Xia; Ichiro Fujinaga; | arxiv-cs.IR | 2025-01-22 |
| 448 | Chromagram Features Analysis for Learning-Based Query By Humming Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The Query by Humming (QBH) system is a melody-based searching system that can retrieve the song without using the information of the title, the composer, or lyrics. To well … |
Kuan-Yu Chen; Jian-Jiun Ding; | 2025 International Conference on Electronics, Information, … | 2025-01-19 |
| 449 | MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an automatic assessment task for TTM models to align with human perception. |
CHENG LIU et. al. | arxiv-cs.SD | 2025-01-18 |
| 450 | Deep Learning for Music Genre Classification: A Case Study of Thai Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Classifying music genres plays an important role in music recommendation systems and information retrieval. Advanced deep learning model has provided promising results compared to … |
Pasin Sawaengsawangarom; Suparoek Phongoen; Papis Wongchaisuwat; | Proceedings of the 2025 9th International Conference on … | 2025-01-16 |
| 451 | XMusic: Towards A Generalized and Controllable Symbolic Music Generation Framework IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a generalized symbolic music generation framework, XMusic, which supports flexible prompts (i.e., images, videos, texts, tags, and humming) to generate emotionally controllable and high-quality symbolic music. |
Sida Tian; Can Zhang; Wei Yuan; Wei Tan; Wenjie Zhu; | arxiv-cs.SD | 2025-01-15 |
| 452 | Innovative Applications and Teaching Effectiveness Analysis of Interactive Mobile Technology in Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the rapid advancement of mobile Internet technology, the application of interactive mobile technology in education has emerged as a significant area of research, particularly … |
Na Sun; Yingran Zang; | Int. J. Interact. Mob. Technol. | 2025-01-13 |
| 453 | Sanidha: A Studio Quality Multi-Modal Dataset for Carnatic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce ‘Sanidha’, the first open-source novel dataset for Carnatic music, offering studio-quality, multi-track recordings with minimal to no overlap or bleed. |
Venkatakrishnan Vaidyanathapuram Krishnan; Noel Alben; Anish Nair; Nathaniel Condit-Schultz; | arxiv-cs.SD | 2025-01-12 |
| 454 | Music Tagging with Classifier Group Chains Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose music tagging with classifier chains that model the interplay of music tags. |
Takuya Hasumi; Tatsuya Komatsu; Yusuke Fujita; | arxiv-cs.SD | 2025-01-09 |
| 455 | Music and Art: A Study in Cross-modal Interpretation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose guidelines for using music to enhance the experience of viewing art, and we propose directions for future research. |
Paul Warren; Paul Mulholland; Naomi Barker; | arxiv-cs.HC | 2025-01-09 |
| 456 | Evaluating Interval-based Tokenization for Pitch Representation in Symbolic Music Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a general framework for building interval-based tokenizations. |
Dinh-Viet-Toan Le; Louis Bigo; Mikaela Keller; | arxiv-cs.IR | 2025-01-08 |
| 457 | MAJL: A Model-Agnostic Joint Learning Framework for Music Source Separation and Pitch Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these methods still face two critical challenges that limit the improvement of both tasks: the lack of labeled data and joint learning optimization. To address these challenges, we propose a Model-Agnostic Joint Learning (MAJL) framework for both tasks. |
Haojie Wei; Jun Yuan; Rui Zhang; Quanyu Dai; Yueguo Chen; | arxiv-cs.SD | 2025-01-07 |
| 458 | Multi-label Cross-lingual Automatic Music Genre Classification from Lyrics with Sentence BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a multi-label, cross-lingual genre classification system based on multilingual sentence embeddings generated by sBERT. |
Tiago Fernandes Tavares; Fabio José Ayres; | arxiv-cs.IR | 2025-01-07 |
| 459 | Application of Blockchain Technology in Digital Music Copyright Management: A Case Study of VNT Chain Platform Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents the design and development of a digital music copyright management system, built on the VNT Chain blockchain platform. The system aims to enhance copyright … |
Qilong Shi; Yan Zhou; | Frontiers Blockchain | 2025-01-07 |
| 460 | SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and An Open-Source Professional Testset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a high-fidelity singing voice conversion system. |
YIQUAN ZHOU et. al. | arxiv-cs.SD | 2025-01-06 |
| 461 | A System for Melodic Harmonization Using Schoenberg Regions, Giant Steps, and Church Modes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, I describe Harmonizer, a prototype system for melodic harmonization. |
Frederick Fernandes; | arxiv-cs.SD | 2025-01-05 |
| 462 | Can Impressions of Music Be Extracted from Thumbnail Images? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This type of information is underrepresented in existing music caption datasets due to the challenges associated with extracting it directly from music data. To address this issue, we propose a method for generating music caption data that incorporates non-musical aspects inferred from music thumbnail images, and validate the effectiveness of our approach through human evaluations. |
Takashi Harada; Takehiro Motomitsu; Katsuhiko Hayashi; Yusuke Sakai; Hidetaka Kamigaito; | arxiv-cs.CL | 2025-01-05 |
| 463 | MusicGen-Stem: Multi-stem Music Generation and Edition Through Autoregressive Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To do so, we train one specialized compression algorithm per stem to tokenize the music into parallel streams of tokens. |
Simon Rouard; Robin San Roman; Yossi Adi; Axel Roebel; | arxiv-cs.SD | 2025-01-03 |
| 464 | MMVA: Multimodal Matching Based on Valence and Arousal Across Images, Music, and Musical Captions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Multimodal Matching based on Valence and Arousal (MMVA), a tri-modal encoder framework designed to capture emotional content across images, music, and musical captions. |
Suhwan Choi; Kyu Won Kim; Myungjoo Kang; | arxiv-cs.SD | 2025-01-02 |
| 465 | MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a self-supervised music representation learning model for music understanding. |
HAINA ZHU et. al. | arxiv-cs.SD | 2025-01-02 |
| 466 | Multimodal Music Genre Classification of Sotho-Tswana Musical Videos Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music genre classification is a fundamental task in music information retrieval, aimed at discerning the categorical placement, or genre, of a given musical piece. Such … |
Osondu E. Oguike; Mpho Primus; | IEEE Access | 2025-01-01 |
| 467 | PIMG: Progressive Image-to-Music Generation With Contrastive Diffusion Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The goal of Image-to-Music Generation is to create pure music according to the given image. Unlike existing tasks such as text-to-image generation, there is no explicit connection … |
Mulin Chen; Yajie Wang; Xuelong Li; | IEEE Transactions on Multimedia | 2025-01-01 |
| 468 | Music Generation Using Deep Learning and Generative AI: A Systematic Review Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents a systematic review of recent advances in music generation using deep learning techniques, categorizing the latest research in the field and identifying key … |
Rohan Mitra; Imran A. Zualkernan; | IEEE Access | 2025-01-01 |
| 469 | Many-to-Many Singing Performance Style Transfer on Pitch and Energy Contours Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Singing voice conversion (SVC) aims to convert the singer identity of a singing voice to that of another singer. However, most existing SVC systems only perform the conversion of … |
Yu-Teng Hsu; J. Wang; Jyh-Shing Roger Jang; | IEEE Signal Processing Letters | 2025-01-01 |
| 470 | MusicEval: A Generative Music Corpus with Expert Ratings for Automatic Text-to-Music Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
CHENG LIU et. al. | ArXiv | 2025-01-01 |
| 471 | Music Emotion Classification Based on Heterogeneous Graph Neural Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The classification of musical emotions is crucial for the indexing, structuring, searching, and recommending of tracks and albums across various music platforms. Consequently, the … |
Jingying Guo; Peng Wang; | IEEE Access | 2025-01-01 |
| 472 | The Integration of Artificial Intelligence and Ethnic Music Cultural Inheritance Under Deep Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The traditional music education system faces numerous challenges in inheriting ethnic music culture. Especially in the modern educational environment, the protection and … |
Wenbo Chang; | Comput. Sci. Inf. Syst. | 2025-01-01 |
| 473 | An Exploration of Controllability in Symbolic Music Infilling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study uses a transformer model to enhance the controllability of generative symbolic music models, specifically related to the infilling task. We introduce a novel Symbolic … |
Rui Guo; Dorien Herremans; | IEEE Access | 2025-01-01 |
| 474 | Application of Big Data Analysis in Optimizing Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the rapid development of big data technology, its application in the field of education, especially music education, has become a new research field. This study is dedicated … |
Shaowei Min; | Journal of Computational Methods in Sciences and Engineering | 2025-01-01 |
| 475 | Emotion-Based Music Recommendation System Integrating Facial Expression Recognition and Lyrics Sentiment Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Facial Expression Recognition (FER) has created widespread interest due to its potential uses in personalized technology and mental health, notably in systems that recommend music … |
V. S. G. S. P. Bottu; Krishnasamy Ragavan; | IEEE Access | 2025-01-01 |
| 476 | A CNN-Based Approach for Classical Music Recognition and Style Emotion Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music recognition refers to the process of automatically recognizing and classifying the musical content in audio signals using computer technology and algorithms. Music … |
Yawen Shi; | IEEE Access | 2025-01-01 |
| 477 | Aggregating Contextual Information for Multi-Criteria Online Music Recommendations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper introduces CAMCMusic, a novel context-aware multi-criteria music recommendation system designed to address these limitations without relying on user-specific … |
Jieqi Liu; | IEEE Access | 2025-01-01 |
| 478 | Music for All: Exploring Multicultural Representations in Music Generation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save |
ATHARVA MEHTA et. al. | ArXiv | 2025-01-01 |
| 479 | Unrolled Creative Adversarial Network For Generating Novel Musical Pieces Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, a classical system was employed alongside a new system to generate creative music. |
Pratik Nag; | arxiv-cs.SD | 2024-12-31 |
| 480 | Music Genre Classification: Ensemble Learning with Subcomponents-level Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The letter introduces a novel approach by combining ensemble learning with attention to sub-components, aiming to enhance the accuracy of identifying music genres. |
Yichen Liu; Abhijit Dasgupta; Qiwei He; | arxiv-cs.SD | 2024-12-20 |
| 481 | Tuning Music Education: AI-Powered Personalization in Learning Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In the second case study we prototype adaptive piano method books that use Automatic Music Transcription to generate exercises at different skill levels while retaining a close connection to musical interests. |
Mayank Sanganeria; Rohan Gala; | arxiv-cs.SD | 2024-12-18 |
| 482 | Detecting Machine-Generated Music with Explainability – A Challenge and Early Benchmarks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Machine-generated music (MGM) has become a groundbreaking innovation with wide-ranging applications, such as music therapy, personalised editing, and creative inspiration within … |
Yupei Li; Qiyang Sun; Hanqian Li; Lucia Specia; Björn W. Schuller; | ArXiv | 2024-12-18 |
| 483 | Detecting Machine-Generated Music with Explainability — A Challenge and Early Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By providing a comprehensive comparison of benchmark results and their interpretability, we propose several directions to inspire future research to develop more robust and effective detection methods for MGM. |
Yupei Li; Qiyang Sun; Hanqian Li; Lucia Specia; Björn W. Schuller; | arxiv-cs.SD | 2024-12-17 |
| 484 | Leveraging User-Generated Metadata of Online Videos for Cover Song Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a multi-modal approach for cover song identification on online video platforms. |
Simon Hachmeier; Robert Jäschke; | arxiv-cs.MM | 2024-12-16 |
| 485 | A Benchmark and Robustness Study of In-Context-Learning with Large Language Models in Music Entity Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we provide a novel dataset of user-generated metadata and conduct a benchmark and a robustness study using recent LLMs with in-context-learning (ICL). |
Simon Hachmeier; Robert Jäschke; | arxiv-cs.CL | 2024-12-16 |
| 486 | Diffusion Models for Automatic Music Mixing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music mixing is a process that involves fine-tuning the levels, dynamics, and frequency content of musical elements to ensure clarity and harmony in the final music production. In … |
Xinyang Wu; Andrew Horner; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 487 | Sparse Sounds: Exploring Low-Dimensionality in Music Generation Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We are the first to explore the intricacies of LLM compression techniques in the context of text-to-music generation, focusing on the MusicGen Transformer model. We implement and … |
Shu Wang; Shiwei Liu; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 488 | EME33: A Dataset of Classical Piano Performances Guided By Expressive Markings with Application in Music Rendering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Expressive performance in classical music plays a crucial role in shaping interpretations of musical pieces. However, existing datasets often provide limited attention to … |
Tzu-Ching Hung; Jingjing Tang; Kit Armstrong; Yi-Cheng Lin; Yi-Wen Liu; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 489 | Graph Neural Network Guided Music Mashup Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music mashups integrate elements from different songs to create surprising and engaging listening experiences. Typically, a mashup combines the vocal track of a base song with the … |
Xinyang Wu; Andrew Horner; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 490 | Challenging Fairness: A Comprehensive Exploration of Bias in LLM-Based Recommendations IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large Language Model (LLM)-based recommendation systems provide more comprehensive recommendations than traditional systems by deeply analyzing content and user behavior. However, … |
Shahnewaz Karim Sakib; Anindya Bijoy Das; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 491 | Interpreting Graphic Notation with MusicLDM: An AI Improvisation of Cornelius Cardew’s Treatise Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work presents a novel method for composing and improvising music inspired by Cornelius Cardew’s Treatise, using AI to bridge graphic notation and musical expression. |
Tornike Karchkhadze; Keren Shao; Shlomo Dubnov; | arxiv-cs.SD | 2024-12-12 |
| 492 | Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce a novel method named Visuals Music Bridge (VMB). |
BAISEN WANG et. al. | arxiv-cs.CV | 2024-12-12 |
| 493 | The Emotional Bridge: Exploring The Association Between Color and Western and Chinese Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Two prominent hypotheses have been proposed to explain the intriguing connection between music and color. The Direct Link Hypothesis posits a direct correlation between the two … |
Kaihui Lin; Daixin Zhang; Rongrong Chen; | Proceedings of the 17th International Symposium on Visual … | 2024-12-11 |
| 494 | Frechet Music Distance: A Metric For Generative Symbolic Music Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper we introduce the Frechet Music Distance (FMD), a novel evaluation metric for generative symbolic music models, inspired by the Frechet Inception Distance (FID) in computer vision and Frechet Audio Distance (FAD) in generative audio. |
Jan Retkowski; Jakub Stępniak; Mateusz Modrzejewski; | arxiv-cs.SD | 2024-12-10 |
| 495 | Advancing Music Therapy: Integrating Eastern Five-Element Music Theory and Western Techniques with AI in The Novel Five-Element Harmony System Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this article, we developed a music therapy system for the first time by applying the theory of the five elements in music therapy to practice. |
Yubo Zhou; Weizhen Bian; Kaitai Zhang; Xiaohan Gu; | arxiv-cs.HC | 2024-12-09 |
| 496 | MuMu-LLaMA: Multi-modal Music Understanding and Generation Via Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To address this, we introduce a dataset with 167.69 hours of multi-modal data, including text, images, videos, and music annotations. Based on this dataset, we propose MuMu-LLaMA, a model that leverages pre-trained encoders for music, images, and videos. |
Shansong Liu; Atin Sakkeer Hussain; Qilong Wu; Chenshuo Sun; Ying Shan; | arxiv-cs.SD | 2024-12-09 |
| 497 | Converting Vocal Performances Into Sheet Music Leveraging Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Advanced natural language processing (NLP) models are increasingly applied in music composition and performance, particularly in generating vocal melodies and simulating singing … |
Jinjing Jiang; Nicole Anne Teo Huiying; Haibo Pen; Seng-Beng Ho; Zhaoxia Wang; | 2024 IEEE International Conference on Data Mining Workshops … | 2024-12-09 |
| 498 | VidMusician: Video-to-Music Generation with Semantic-Rhythmic Alignment Via Hierarchical Visual Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose VidMusician, a parameter-efficient video-to-music generation framework built upon text-to-music models. |
SIFEI LI et. al. | arxiv-cs.SD | 2024-12-09 |
| 499 | Jess+: Designing Embodied AI for Interactive Music-making Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we discuss the conceptualisation and design of embodied AI within an inclusive music-making project. |
Craig Vear; Johann Benerradi; | arxiv-cs.HC | 2024-12-09 |
| 500 | AI TrackMate: Finally, Someone Who Will Give Your Music More Than Just Sounds Great! Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The rise of bedroom producers has democratized music creation, while challenging producers to objectively evaluate their work. To address this, we present AI TrackMate, an LLM-based music chatbot designed to provide constructive feedback on music productions. |
Yi-Lin Jiang; Chia-Ho Hsiung; Yen-Tung Yeh; Lu-Rong Chen; Bo-Yu Chen; | arxiv-cs.SD | 2024-12-09 |
| 501 | Source Separation & Automatic Transcription for Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Using spectrogram masking, deep neural networks, and the MuseScore API, we attempt to create an end-to-end pipeline that allows for an initial music audio mixture (e.g.. |
Bradford Derby; Lucas Dunker; Samarth Galchar; Shashank Jarmale; Akash Setti; | arxiv-cs.SD | 2024-12-09 |
| 502 | M6: Multi-generator, Multi-domain, Multi-lingual and Cultural, Multi-genres, Multi-instrument Machine-Generated Music Detection Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Detecting machine-generated music (MGMD) is, therefore, critical to safeguarding these domains, yet the field lacks comprehensive datasets to support meaningful progress. To address this gap, we introduce M6, a large-scale benchmark dataset tailored for MGMD research. |
Yupei Li; Hanqian Li; Lucia Specia; Björn W. Schuller; | arxiv-cs.SD | 2024-12-08 |
| 503 | Semi-Supervised Contrastive Learning for Controllable Video-to-Music Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, identifying the best music for a video can be a difficult and time-consuming task. To address this challenge, we propose a novel framework for automatically retrieving a matching music clip for a given video, and vice versa. |
Shanti Stewart; Gouthaman KV; Lie Lu; Andrea Fanelli; | arxiv-cs.MM | 2024-12-08 |
| 504 | Aligned Music Notation and Lyrics Transcription Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces and formalizes, for the first time, the Aligned Music Notation and Lyrics Transcription (AMNLT) challenge, which addresses the complete transcription of vocal scores by jointly considering music symbols, lyrics, and their synchronization. |
Eliseo Fuentes-Martínez; Antonio Ríos-Vila; Juan C. Martinez-Sevilla; David Rizo; Jorge Calvo-Zaragoza; | arxiv-cs.CV | 2024-12-05 |
| 505 | Missing Melodies: AI Music Generation and Its Nearly Complete Omission of The Global South Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We conducted an extensive analysis of over one million hours of audio datasets used in AI music generation research and manually reviewed more than 200 papers from eleven prominent AI and music conferences and organizations (AAAI, ACM, EUSIPCO, EURASIP, ICASSP, ICML, IJCAI, ISMIR, NeurIPS, NIME, SMC) to identify a critical gap in the fair representation and inclusion of the musical genres of the Global South in AI research. |
Atharva Mehta; Shivam Chauhan; Monojit Choudhury; | arxiv-cs.SD | 2024-12-05 |
| 506 | Exploring Transformer-Based Music Overpainting for Jazz Piano Variations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce VAR4000, a subset of a larger dataset for jazz piano performances, consisting of 4,352 training pairs. |
Eleanor Row; Ivan Shanin; György Fazekas; | arxiv-cs.SD | 2024-12-05 |
| 507 | Relationships Between Keywords and Strong Beats in Lyrical Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Artificial Intelligence (AI) song generation has emerged as a popular topic, yet the focus on exploring the latent correlations between specific lyrical and rhythmic features remains limited. In contrast, this pilot study particularly investigates the relationships between keywords and rhythmically stressed features such as strong beats in songs. |
Callie C. Liao; Duoduo Liao; Ellie L. Zhang; | arxiv-cs.SD | 2024-12-05 |
| 508 | SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We observe that the differences between singing and talking audios manifest in terms of frequency and amplitude. |
YAN LI et. al. | arxiv-cs.CV | 2024-12-04 |
| 509 | Generation of Photo Slideshow with Song Based on Closeness Between Concept of Lyrics and That of Images IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper proposes a method that allows users to easily convert a large number of still images into movies by displaying photos in sync with memories or favorite songs, which … |
Mei Hashimoto; Michiharu Niimi; | 2024 Asia Pacific Signal and Information Processing … | 2024-12-03 |
| 510 | Advancing Music Emotion Recognition: A Transformer Encoder-Based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music Emotion Recognition (MER) involves identifying the emotional content conveyed by music. This field is becoming increasingly significant due to its broad range of … |
Yangyuan Chen; Zhizhong Ma; Mingjing Wang; Mingzhe Liu; | Proceedings of the 6th ACM International Conference on … | 2024-12-03 |
| 511 | ArtStory Beats: Highlighting Interactions Between Visual Arts and Music with Storytelling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this article, we present the use-case of ArtStory Beats, a mobile application that aims to reflect the creative dialogue between visual arts and music, offering stories about … |
M. Vayanou; A. Katifori; A. Antoniou; G. Loumos; Yannis E. Ioannidis; | ACM Journal on Computing and Cultural Heritage | 2024-12-02 |
| 512 | An Investigation of The Effect of Smart Cockpit Layout on Distracted Driving Behavior Based on Real Road Experiments Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the context of vehicle intelligence, smart cockpits are widely used in modern vehicle design. However, with the popularity of smart cockpits, their impact on drivers’ driving … |
Lin Hu; Xinjiao Deng; Fang Wang; Xianhui Wu; | IEEE Transactions on Intelligent Transportation Systems | 2024-12-01 |
| 513 | Triple Factorization-Based SNLF Representation With Improved Momentum-Incorporated AGD: A Knowledge Transfer Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Symmetric, high-dimensional and sparse (SHiDS) networks usually contain rich knowledge regarding various patterns. To adequately extract useful information from SHiDS networks, a … |
Ming Li; Yan Song; Derui Ding; Ran Sun; | IEEE Transactions on Knowledge and Data Engineering | 2024-12-01 |
| 514 | MusicGen-Chord: Advancing Music Generation Through Chord Progressions and Interactive Web-UI Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: MusicGen is a music generation language model (LM) that can be conditioned on textual descriptions and melodic features. We introduce MusicGen-Chord, which extends this capability by incorporating chord progression features. |
Jongmin Jung; Andreas Jansson; Dasaem Jeong; | arxiv-cs.SD | 2024-11-29 |
| 515 | Parameter-Efficient Transfer Learning for Music Foundation Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: More music foundation models are recently being released, promising a general, mostly task independent encoding of musical information. Common ways of adapting music foundation … |
Yiwei Ding; Alexander Lerch; | ArXiv | 2024-11-28 |
| 516 | Music2Fail: Transfer Music to Failed Recorder Style Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate another style transfer scenario called “failed-music style transfer”. |
CHON IN LEONG et. al. | arxiv-cs.SD | 2024-11-27 |
| 517 | Synthesising Handwritten Music with GANs: A Comprehensive Evaluation of CycleWGAN, ProGAN, and DCGAN Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper addresses the data scarcity problem by applying Generative Adversarial Networks (GANs) to synthesise realistic handwritten music sheets. |
Elona Shatri; Kalikidhar Palavala; George Fazekas; | arxiv-cs.CV | 2024-11-25 |
| 518 | Proceedings of The 6th International Workshop on Reading Music Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The International Workshop on Reading Music Systems (WoRMS) is a workshop that tries to connect researchers who develop systems for reading music, such as in the field of Optical … |
Jorge Calvo-Zaragoza; Alexander Pacha; Elona Shatri; | arxiv-cs.CV | 2024-11-24 |
| 519 | Stylus: Repurposing Stable Diffusion for Training-Free Music Style Transfer on Mel-Spectrograms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To improve fidelity, we introduce a phase-preserving reconstruction strategy that avoids artifacts from Griffin-Lim reconstruction, and we adopt classifier-free-guidance-inspired control for adjustable stylization and multi-style blending. |
HEEHWAN WANG et. al. | arxiv-cs.SD | 2024-11-24 |
| 520 | DAIRHuM: A Platform for Directly Aligning AI Representations with Human Musical Judgments Applied to Carnatic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a platform for exploring the Direct alignment between AI music model Representations and Human Musical judgments (DAIRHuM). |
Prashanth Thattai Ravikumar; | arxiv-cs.SD | 2024-11-22 |
| 521 | Analysis of Vocal Music Teaching in University Based on Artificial Intelligence Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the increasing development of electronic computer information technology, concepts such as intelligent networking and big data have been introduced one after another. In this … |
Jinshan Han; | Proceedings of the 2024 3rd International Conference on … | 2024-11-22 |
| 522 | Mode-conditioned Music Learning and Composition: A Spiking Neural Network Inspired By Neuroscience and Psychology Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a spiking neural network inspired by brain mechanisms and psychological theories to represent musical modes and keys, ultimately generating musical pieces that incorporate tonality features. |
Qian Liang; Yi Zeng; Menghaoran Tang; | arxiv-cs.SD | 2024-11-22 |
| 523 | Generative AI for Music and Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this dissertation, I introduce the three main directions of my research centered around generative AI for music and audio: 1) multitrack music generation, 2) assistive music creation tools, and 3) multimodal learning for audio and music. |
Hao-Wen Dong; | arxiv-cs.SD | 2024-11-21 |
| 524 | Building Music with Lego Bricks and Raspberry Pi Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, a system to build music in an intuitive and accessible way, with Lego bricks, is presented. |
Ana M. Barbancho; Lorenzo J. Tardon; Isabel Barbancho; | arxiv-cs.HC | 2024-11-20 |
| 525 | Song Form-aware Full-Song Text-to-Lyrics Generation with Multi-Level Granularity Syllable Count Control Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Lyrics generation presents unique challenges, particularly in achieving precise syllable control while adhering to song form structures such as verses and choruses. Conventional … |
Yunkee Chae; Eunsik Shin; Suntae Hwang; Seungryeol Paik; Kyogu Lee; | arxiv-cs.CL | 2024-11-20 |
| 526 | Improving Controllability and Editability for Pretrained Text-to-Music Generation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These challenges include providing sufficient control over the generated content and allowing for flexible, precise edits. This thesis tackles these issues by introducing a series of advancements that progressively build upon each other, enhancing the controllability and editability of text-to-music generation models. |
Yixiao Zhang; | arxiv-cs.SD | 2024-11-19 |
| 527 | Zero-Shot Crate Digging: DJ Tool Retrieval Using Speech Activity, Music Structure And CLAP Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work we demonstrate an approach to discovering DJ tools in personal music collections. |
Iroro Orife; | arxiv-cs.SD | 2024-11-18 |
| 528 | Do Captioning Metrics Reflect Music Semantic Alignment? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present cases where traditional metrics are vulnerable to syntactic changes, and show they do not correlate well with human judgments. By addressing these issues, we aim to emphasize the need for a critical reevaluation of how music captions are assessed. |
Jinwoo Lee; Kyogu Lee; | arxiv-cs.SD | 2024-11-18 |
| 529 | Examining Platformization in Cultural Production: A Comparative Computational Analysis of Hit Songs on TikTok and Spotify Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores how TikTok and Spotify, situated in different governance and user contexts, could influence digital music production and reception within each platform and between each other. |
Na Ta; Fang Jiao; Cong Lin; Cuihua Shen; | arxiv-cs.SI | 2024-11-17 |
| 530 | Language Models for Music Medicine Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose fine-tuning MusicGen, a music-generating transformer model, to create short musical clips that assist patients in transitioning from negative to desired emotional states. |
EMMANOUIL NIKOLAKAKIS et. al. | arxiv-cs.SD | 2024-11-13 |
| 531 | PerceiverS: A Multi-Scale Perceiver with Effective Segmentation for Long-Term Expressive Symbolic Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: AI-based music generation has made significant progress in recent years. However, generating symbolic music that is both long-structured and expressive remains a significant challenge. In this paper, we propose PerceiverS (Segmentation and Scale), a novel architecture designed to address this issue by leveraging both Effective Segmentation and Multi-Scale attention mechanisms. Our approach enhances symbolic music generation by simultaneously learning long-term structural dependencies and short-term expressive details. |
Yungang Yi; Weihua Li; Matthew Kuo; Quan Bai; | arxiv-cs.AI | 2024-11-12 |
| 532 | Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present a data generation framework for rich music discovery dialogue using a large language model (LLM) and user intents, system actions, and musical attributes. |
SeungHeon Doh; Keunwoo Choi; Daeyong Kwon; Taesu Kim; Juhan Nam; | arxiv-cs.SD | 2024-11-11 |
| 533 | Timing and Dynamics of The Rosanna Shuffle Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this analysis, we examine the timing and dynamics of the original drum track, focusing on rhythmic variations such as swing factor, microtiming deviations, tempo drift, and the overall dynamics of the hi-hat pattern. |
Esa Räsänen; Niko Gullsten; Otto Pulkkinen; Tuomas Virtanen; | arxiv-cs.SD | 2024-11-11 |
| 534 | Generating Mixcode Popular Songs with Artificial Intelligence: Concepts, Plans, and Speculations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper discusses a proposed project integrating artificial intelligence and popular music, with the ultimate goal of creating a powerful tool for implementing music for social transformation, education, healthcare, and emotional well-being. |
Abhishek Kaushik; Kayla Rush; | arxiv-cs.IR | 2024-11-10 |
| 535 | Psychological Needs As Credible Song Signals: Testing Large Language Models to Annotate Lyrics Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Our preliminary study presents a new perspective in music information retrieval by investigating how contemporary song-making and listening emulate our innate responses, similar … |
Eunsun Smith; Yinxuan Wang; Eric Matson; | Conference on Computer Science and Information Systems | 2024-11-09 |
| 536 | Evolutionary Music Synthesis: A Generative AI System with Interactive User Feedback Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this study, a music generation system that reflects user preferences by combining music generation AI and interactive evolutionary computation is proposed. In recent years, … |
Keishi Ohya; Emmanuel Ayedoun; Masataka Tokumaru; | 2024 Joint 13th International Conference on Soft Computing … | 2024-11-09 |
| 537 | Harnessing High-Level Song Descriptors Towards Natural Language-Based Music Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we assess LMs effectiveness in recommending songs based on user natural language descriptions and items with descriptors like genres, moods, and listening contexts. |
Elena V. Epure; Gabriel Meseguer-Brocal; Darius Afchar; Romain Hennequin; | arxiv-cs.IR | 2024-11-08 |
| 538 | Long-Form Text-to-Music Generation with Adaptive Prompts: A Case Study in Tabletop Role-Playing Games Soundtracks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce Babel Bardo, a system that uses Large Language Models (LLMs) to transform speech transcriptions into music descriptions for controlling a text-to-music model. |
Felipe Marra; Lucas N. Ferreira; | arxiv-cs.SD | 2024-11-06 |
| 539 | MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, to date, there has been no work that considers them jointly to explore the modality alignment within. To bridge this gap, we propose a novel framework, termed MoMu-Diffusion, for long-term and synchronous motion-music generation. |
FUMING YOU et. al. | arxiv-cs.SD | 2024-11-04 |
| 540 | PIAST: A Multimodal Piano Dataset with Audio, Symbolic and Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: While piano music has become a significant area of study in Music Information Retrieval (MIR), there is a notable lack of datasets for piano solo music with text labels. To address this gap, we present PIAST (PIano dataset with Audio, Symbolic, and Text), a piano music dataset. |
HAYEON BANG et. al. | arxiv-cs.SD | 2024-11-04 |
| 541 | Generative AI and EEG-based Music Personalization for Work Stress Reduction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The escalating prevalence of work-related stress has led to a notable decline in work performance and the mental well-being of the workforce. Studies suggest that personally … |
VARSHA WIJETHUNGE et. al. | IECON 2024 – 50th Annual Conference of the IEEE Industrial … | 2024-11-03 |
| 542 | Sing-On-Your-Beat: Simple Text-Controllable Accompaniment Generations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With advancements in deep learning, previous research has focused on generating suitable accompaniments but often lacks precise alignment with the desired instrumentation and genre. To address this, we propose a straightforward method that enables control over the accompaniment through text prompts, allowing the generation of music that complements the vocals and aligns with the song’s instrumental and genre requirements. |
Quoc-Huy Trinh; Minh-Van Nguyen; Trong-Hieu Nguyen Mau; Khoa Tran; Thanh Do; | arxiv-cs.SD | 2024-11-03 |
| 543 | I’ve Heard This Before: Initial Results on TikTok’s Impact On The Re-Popularization of Songs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we analyze how TikTok helps to revitalize older songs. |
Breno Matos; Francisco Galuppo; Rennan Cordeiro; Flavio Figueiredo; | arxiv-cs.SI | 2024-11-02 |
| 544 | Assessing The Impact of Sampling, Remixes, and Covers on Original Song Popularity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Using Who Sampled data and Google Trends, we examine how the popularity of a borrowing song affects the original. |
Guilherme Soares S. dos Santos; Flavio Figueiredo; | arxiv-cs.SI | 2024-11-02 |
| 545 | Music Foundation Model As Generic Booster for Music Downstream Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce SoniDo, a music foundation model (MFM) designed to extract hierarchical features from target music samples. |
WEIHSIANG LIAO et. al. | arxiv-cs.SD | 2024-11-02 |
| 546 | MIRFLEX: Music Information Retrieval Feature Library for Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces an extendable modular system that compiles a range of music feature extraction models to aid music information retrieval research. |
Anuradha Chopra; Abhinaba Roy; Dorien Herremans; | arxiv-cs.SD | 2024-11-01 |
| 547 | Measure By Measure: Measure-Based Automatic Music Composition with Modern Staff Notation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper introduces a hierarchical framework for automatic composition of polyphonic music in Western modern staff notation. Central to our framework, a music score is … |
Yujia Yan; Zhiyao Duan; | Trans. Int. Soc. Music. Inf. Retr. | 2024-11-01 |
| 548 | The Role of Artificial Intelligence in Personalized Music Teaching Quality Evaluation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the increasing emphasis on personalized learning in the field of education, music teaching has gradually turned to personalized methods to meet the diversified needs of … |
Shiwei Zhao; | Journal of Computational Methods in Sciences and Engineering | 2024-11-01 |
| 549 | Machine Learning Framework for Audio-Based Content Evaluation Using MFCC, Chroma, Spectral Contrast, and Temporal Feature Engineering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study presents a machine learning framework for assessing similarity between audio content and predicting sentiment score. |
Aris J. Aristorenas; | arxiv-cs.SD | 2024-10-31 |
| 550 | HKDSME: Heterogeneous Knowledge Distillation for Semi-supervised Singing Melody Extraction Using Harmonic Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To solve the two issues, in this paper, we propose a heterogeneous knowledge distillation framework for semi-supervised singing melody extraction using harmonic supervision, termed HKDSME. |
Shuai Yu; Xiaoliang He; Ke Chen; Yi Yu; | mm | 2024-10-30 |
| 551 | CNN-LSTM Based Multimodal Models for Music Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this article, we tackle the dual challenges of efficiency and quality in music generation. We aim to create a model that produces high-quality music efficiently while keeping … |
Man Zhang; Dongning Liu; | 2024 IEEE International Symposium on Parallel and … | 2024-10-30 |
| 552 | Controllable Music Loops Generation with MIDI and Text Via Multi-Stage Cross Attention and Instrument-Aware Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, they often fail to precisely render critical music loop attributes, including melody, rhythms, and instrumentation, which are essential for modern music loop production. To overcome this limitation, this paper proposes a Loops Transformer and a Multi-Stage Cross Attention mechanism that enable a cohesive integration of textual and MIDI input specifications. |
Guan-Yuan Chen; Von-Wun Soo; | mm | 2024-10-30 |
| 553 | MUSCAT: A Multimodal MUSic Collection for Automatic Transcription of Real Recordings and Image Scores Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Nevertheless, while proven to outperform single-modality recognition rates, this approach has been exclusively validated under controlled scenarios—monotimbral and monophonic synthetic data—mainly due to a lack of collections with symbolic score-level annotations for both recordings and graphical sheets. To promote research on this topic, this work presents the Multimodal mUSic Collection for Automatic Transcription (MUSCAT) assortment of acoustic recordings, image sheets, and their score-level annotations in several notation formats. |
ALEJANDRO GALAN-CUENCA et. al. | mm | 2024-10-30 |
| 554 | Emotion-Guided Image to Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents an emotion-guided image-to-music generation framework that leverages the Valence-Arousal (VA) emotional space to produce music that aligns with the emotional tone of a given image. |
Souraja Kundu; Saket Singh; Yuji Iwahori; | arxiv-cs.SD | 2024-10-29 |
| 555 | Semi-Supervised Self-Learning Enhanced Music Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To handle the noisy label issue, we propose a semi-supervised self-learning (SSSL) method, which can differentiate between samples with correct and incorrect labels in a self-learning manner, thus effectively utilizing the augmented segment-level data. |
Yifu Sun; Xulong Zhang; Monan Zhou; Wei Li; | arxiv-cs.SD | 2024-10-29 |
| 556 | ExpressiveSinger: Multilingual and Multi-Style Score-based Singing Voice Synthesis with Expressive Performance Control IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Singing Voice Synthesis (SVS) has significantly advanced with deep generative models, achieving high audio quality but still struggling with musicality, mainly due to the lack of … |
Shuqi Dai; Ming-Yu Liu; Rafael Valle; Siddharth Gururani; | ACM Multimedia | 2024-10-28 |
| 557 | Symbotunes: Unified Hub for Symbolic Music Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Therefore, directly comparing the methods or becoming acquainted with them may present challenges. To mitigate this issue we introduce Symbotunes, an open-source unified hub for symbolic music generative models. |
Paweł Skierś; Maksymilian Łazarski; Michał Kopeć; Mateusz Modrzejewski; | arxiv-cs.SD | 2024-10-27 |
| 558 | An Approach to Hummed-tune and Song Sequences Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper covers details about the pre-processed data from the original type (mp3) to usable form for training and inference. |
LOC BAO PHAM et. al. | arxiv-cs.SD | 2024-10-27 |
| 559 | MusicFlow: Cascaded Flow Matching for Text Guided Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce MusicFlow, a cascaded text-to-music generation model based on flow matching. |
K R PRAJWAL et. al. | arxiv-cs.SD | 2024-10-27 |
| 560 | We Musicians Know How to Divide and Conquer: Exploring Multimodal Interactions To Improve Music Reading and Memorization for Blind and Low Vision Learners Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite the potential of multimodal assistive technologies (MATs) to convey visual information, such as music notation, to blind or low-vision (BLV) individuals, we do not fully … |
Leon Lu; Chase Crispin; Audrey Girouard; | Proceedings of the 26th International ACM SIGACCESS … | 2024-10-27 |
| 561 | Arabic Music Classification and Generation Using Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The dataset used in this project consists of new and classical Egyptian music pieces composed by different composers. |
MOHAMED ELSHAARAWY et. al. | arxiv-cs.SD | 2024-10-25 |
| 562 | Melody Construction for Persian Lyrics Using LSTM Recurrent Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The present paper investigated automatic melody construction for Persian lyrics as an input. |
Farshad Jafari; Farzad Didehvar; Amin Gheibi; | arxiv-cs.SD | 2024-10-23 |
| 563 | Striking A New Chord: Neural Networks in Music Information Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Specifically, we compare LSTM, Transformer, and GPT models against a widely-used markov model to predict a chord event following a sequence of chords. |
Farshad Jafari; Claire Arthur; | arxiv-cs.IT | 2024-10-23 |
| 564 | Audio-to-Score Conversion Model Based on Whisper Methodology Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This thesis innovatively introduces the Orpheus’ Score, a custom notation system that converts music information into tokens, designs a custom vocabulary library, and trains a corresponding custom tokenizer. |
Hongyao Zhang; Bohang Sun; | arxiv-cs.SD | 2024-10-22 |
| 565 | Music102: An $D_{12}$-equivariant Transformer for Chord Progression Accompaniment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present Music102, an advanced model aimed at enhancing chord progression accompaniment through a $D_{12}$-equivariant transformer. |
Weiliang Luo; | arxiv-cs.SD | 2024-10-22 |
| 566 | Musinger: Communication of Music Over A Distance with Wearable Haptic Display and Touch Sensitive Surface Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the integration of auditory and tactile experiences in musical haptics, focusing on enhancing sensory dimensions of music through touch. Addressing the gap in translating auditory signals to meaningful tactile feedback, our research introduces a novel method involving a touch-sensitive recorder and a wearable haptic display that captures musical interactions via force sensors and converts these into tactile sensations. |
MIGUEL ALTAMIRANO CABRERA et. al. | arxiv-cs.HC | 2024-10-21 |
| 567 | Music2P: A Multi-Modal AI-Driven Tool for Simplifying Album Cover Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Our ultimate goal is to provide a tool that empowers musicians and producers, especially those with limited resources or expertise, to create compelling album covers. |
Joong Ho Choi; Geonyeong Choi; Ji Eun Han; Wonjin Yang; Zhi-Qi Cheng; | cikm | 2024-10-21 |
| 568 | OpenMU: Your Swiss Army Knife for Music Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present OpenMU-Bench, a large-scale benchmark suite for addressing the data scarcity issue in training multimodal language models to understand music. |
MENGJIE ZHAO et. al. | arxiv-cs.SD | 2024-10-20 |
| 569 | ArchiTone: A LEGO-Inspired Gamified System for Visualized Music Education Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Informed by formative investigation and inspired by LEGO, we introduce ArchiTone, a gamified system that employs constructivism by visualizing music theory concepts as musical blocks and buildings for music education. |
JIAXING YU et. al. | arxiv-cs.HC | 2024-10-20 |
| 570 | Audio Processing Using Pattern Recognition for Music Genre Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This research aims to contribute to improving music recommendation systems and content curation. |
Sivangi Chatterjee; Srishti Ganguly; Avik Bose; Hrithik Raj Prasad; Arijit Ghosal; | arxiv-cs.SD | 2024-10-19 |
| 571 | Music Therapy for Autism Spectrum Disorder: A Comprehensive Literature Review on Therapeutic Efficacy, Limitations, and AI Integration Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Autism Spectrum Disorder (ASD) is a neurological and developmental condition that presents considerable social, behavioral, and communicative challenges to those diagnosed with … |
Beatrice Low; Xindi Liu; Richard Z. Li; Elizabeth Ren; Jasmine X Zhang; | 2024 IEEE 15th Annual Ubiquitous Computing, Electronics & … | 2024-10-17 |
| 572 | CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: These limitations reduce their effectiveness in a global, multimodal music environment. To address these issues, we introduce CLaMP 2, a system compatible with 101 languages that supports both ABC notation (a text-based musical notation format) and MIDI (Musical Instrument Digital Interface) for music information retrieval. |
SHANGDA WU et. al. | arxiv-cs.SD | 2024-10-17 |
| 573 | MeloTrans: A Text to Symbolic Music Generation Model Following Human Composition Habit Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To achieve this, we develop the POP909_M dataset, the first to include labels for musical motifs and their variants, providing a basis for mimicking human compositional habits. Building on this, we propose MeloTrans, a text-to-music composition model that employs principles of motif development rules. |
YUTIAN WANG et. al. | arxiv-cs.SD | 2024-10-17 |
| 574 | MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Generating music that aligns with the visual content of a video has been a challenging task, as it requires a deep understanding of visual semantics and involves generating music … |
RUIQI LI et. al. | ArXiv | 2024-10-16 |
| 575 | Like It or Not: Exploring The Impact of (Dis)liked Background Music on Player Behavior and Experience Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Entertainment media, including video games, utilize background music (BGM) to enhance ambiance and gameplay, with a growing trend of players replacing in-game audio with personal … |
Marc Schubhan; Sridhar Karra; Maximilian Altmeyer; Antonio Krüger; | Proceedings of the ACM on Human-Computer Interaction | 2024-10-14 |
| 576 | Do We Need More Complex Representations for Structure? A Comparison of Note Duration Representation for Music Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we inquire if the off-the-shelf Music Transformer models perform just as well on structural similarity metrics using only unannotated MIDI information. |
Gabriel Souza; Flavio Figueiredo; Alexei Machado; Deborah Guimarães; | arxiv-cs.SD | 2024-10-14 |
| 577 | CrAIzy MIDI: AI-powered Wearable Musical Instrumental for Novice Player Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Playing music is a deeply fulfilling and universally cherished activity, yet the steep learning curve often discourages novice amateurs. Traditional music creation demands … |
Hongni Ye; Xiangrong Zhu; Yongbo Yang; | Adjunct Proceedings of the 37th Annual ACM Symposium on … | 2024-10-13 |
| 578 | Towards Music-Aware Virtual Assistants Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We propose a system for modifying spoken notifications in a manner that is sensitive to the music a user is listening to. Spoken notifications provide convenient access to rich … |
Alexander Wang; David Lindlbauer; Chris Donahue; | Proceedings of the 37th Annual ACM Symposium on User … | 2024-10-13 |
| 579 | M2M-Gen: A Multimodal Framework for Automated Background Music Generation in Japanese Manga Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces M2M-Gen, a multimodal framework for generating background music tailored to Japanese manga. |
Megha Sharma; Muhammad Taimoor Haseeb; Gus Xia; Yoshimasa Tsuruoka; | arxiv-cs.SD | 2024-10-13 |
| 580 | Small Tunes Transformer: Exploring Macro & Micro-Level Hierarchies for Skeleton-Conditioned Melody Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we delve into the multi-level structures within music from macro-level and micro-level hierarchies. |
Yishan Lv; Jing Luo; Boyuan Ju; Xinyu Yang; | arxiv-cs.SD | 2024-10-11 |
| 581 | Explainability in Music Recommender System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recommendation systems play a crucial role in our daily lives, influencing many of our significant and minor decisions. These systems also have become integral to the music … |
Shahrzad Shashaani; | Proceedings of the 18th ACM Conference on Recommender … | 2024-10-08 |
| 582 | LyricLure: Mining Catchy Hooks in Song Lyrics to Enhance Music Discovery and Recommendation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music Search encounters a significant challenge as users increasingly rely on catchy lines from lyrics to search for both new releases and other popular songs. Integrating lyrics … |
Siddharth Sharma; Akshay Shukla; Ajinkya Walimbe; Tarun Sharma; Joaquin Delgado; | Proceedings of the 18th ACM Conference on Recommender … | 2024-10-08 |
| 583 | Song Emotion Classification of Lyrics with Out-of-Domain Data Under Label Scarcity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We examine the novel usage of a large out-of-domain dataset as a creative solution to the challenge of training data scarcity in the emotional classification of song lyrics. |
Jonathan Sakunkoo; Annabella Sakunkoo; | arxiv-cs.CL | 2024-10-08 |
| 584 | MuRS 2024: 2nd Music Recommender Systems Workshop Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music recommendation has been relevant to the Recommender Systems (RecSys) community since the early days. With the growth of music streaming platforms, algorithmic … |
Andres Ferraro; Lorenzo Porcaro; Peter Knees; Christine Bauer; | Proceedings of the 18th ACM Conference on Recommender … | 2024-10-08 |
| 585 | Thunder: A Design Process to Build Emotionally Engaging Music Visualizations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music’s profound emotional impact extends beyond traditional listening experiences, playing a critical role in shaping user engagement in entertainment contexts, including digital … |
Caio Nunes; Isabelle Reinbold; Mariana Castro; Ticianne Darin; | Proceedings of the XXIII Brazilian Symposium on Human … | 2024-10-07 |
| 586 | Art2Mus: Bridging Visual Arts and Music Through Cross-Modal Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, existing image-to-music models are limited to simple images, lacking the capability to generate music from complex digitized artworks. To address this gap, we introduce Art2Mus, a novel model designed to create music from digitized artworks or text inputs. |
Ivan Rinaldi; Nicola Fanelli; Giovanna Castellano; Gennaro Vessio; | arxiv-cs.MM | 2024-10-07 |
| 587 | UniMuMo: Unified Text, Music and Motion Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: We introduce UniMuMo, a unified multimodal model capable of taking arbitrary text, music, and motion data as input conditions to generate outputs across all three modalities. To … |
HAN YANG et. al. | ArXiv | 2024-10-06 |
| 588 | Music Statistics: Uncertain Logistic Regression Models with Applications in Analyzing Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jue Lu; Lianlian Zhou; Wenxing Zeng; Anshui Li; | Fuzzy Optimization and Decision Making | 2024-10-04 |
| 589 | Enriching Music Descriptions with A Finetuned-LLM and Metadata for Text-to-Music Retrieval IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, users also articulate a need to explore music that shares similarities with their favorite tracks or artists, such as “I need a similar track to Superstition by Stevie Wonder”. To address these concerns, this paper proposes an improved Text-to-Music Retrieval model, denoted as TTMR++, which utilizes rich text descriptions generated with a finetuned large language model and metadata. |
SeungHeon Doh; Minhee Lee; Dasaem Jeong; Juhan Nam; | arxiv-cs.SD | 2024-10-04 |
| 590 | SoundSignature: What Type of Music Do You Like? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we highlight the application’s innovative features and educational potential, and present findings from a pilot user study that evaluates its efficacy and usability. |
Brandon James Carone; Pablo Ripollés; | arxiv-cs.SD | 2024-10-04 |
| 591 | SONIQUE: Video Background Music Generation Using Unpaired Audio-Visual Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present SONIQUE, a model for generating background music tailored to video content. |
Liqian Zhang; Magdalena Fuentes; | arxiv-cs.SD | 2024-10-04 |
| 592 | CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Contrastive Long-form Language-Audio Pretraining (CoLLAP) to significantly extend the perception window for both the input audio (up to 5 minutes) and the language descriptions (exceeding 250 words), while enabling contrastive learning across modalities and temporal dynamics. |
JUNDA WU et. al. | arxiv-cs.SD | 2024-10-03 |
| 593 | Analyzing Byte-Pair Encoding on Monophonic and Polyphonic Symbolic Music: A Focus on Musical Phrase Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Given that symbolic music can differ significantly from text, particularly with polyphony, we investigate how BPE behaves with different types of musical content. This study provides a qualitative analysis of BPE’s behavior across various instrumentations and evaluates its impact on a musical phrase segmentation task for both monophonic and polyphonic music. |
Dinh-Viet-Toan Le; Louis Bigo; Mikaela Keller; | arxiv-cs.IR | 2024-10-02 |
| 594 | Agent-Driven Large Language Models for Mandarin Lyric Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this research, we developed a multi-agent system that decomposes the melody-to-lyric task into sub-tasks, with each agent controlling rhyme, syllable count, lyric-melody alignment, and consistency. |
Hong-Hsiang Liu; Yi-Wen Liu; | arxiv-cs.CL | 2024-10-02 |
| 595 | Generating Symbolic Music from Natural Language Prompts Using An LLM-Enhanced Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present MetaScore, a new dataset consisting of 963K musical scores paired with rich metadata, including free-form user-annotated tags, collected from an online music forum. |
Weihan Xu; Julian McAuley; Taylor Berg-Kirkpatrick; Shlomo Dubnov; Hao-Wen Dong; | arxiv-cs.SD | 2024-10-02 |
| 596 | Do Music Generation Models Encode Music Theory? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Thus, we introduce SynTheory, a synthetic MIDI and audio music theory dataset, consisting of tempo, time signature, note, interval, scale, chord, and chord progression concepts. |
Megan Wei; Michael Freeman; Chris Donahue; Chen Sun; | arxiv-cs.SD | 2024-10-01 |
| 597 | An Adaptive Melody Search Algorithm Based on Low-level Heuristics for Material Feeding Scheduling Optimization in A Hybrid Kitting System Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yufan Huang; Ling-zhi Zhao; Binghai Zhou; | Adv. Eng. Informatics | 2024-10-01 |
| 598 | Melody-Guided Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present the Melody-Guided Music Generation (MG2) model, a novel approach using melody to guide the text-to-music generation that, despite a simple method and limited resources, achieves excellent performance. |
Shaopeng Wei; Manzhen Wei; Haoyu Wang; Yu Zhao; Gang Kou; | arxiv-cs.SD | 2024-09-30 |
| 599 | GENPIA: A Genre-Conditioned Piano Music Generation System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the demand for music continuing to grow as people seek variety and personal resonance, many works focus on music generation. In this study, we propose GENPIA, a … |
QUOC-VIET NGUYEN et. al. | 2024 IEEE 5th International Symposium on the Internet of … | 2024-09-30 |
| 600 | Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes integrating a text-to-music model with a large language model to generate music with form. |
Lilac Atassi; | arxiv-cs.SD | 2024-09-30 |
| 601 | Presence and Flow in Virtual and Mixed Realities for Music-Related Educational Settings Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music in Extended Reality (XR) is increasingly present in both academic and industrial research. While XR applications are more prevalent in STEM education, there is growing … |
Leonard Bruns; Benedict Saurbier; Tray Minh Voong; Michael Oehler; | 2024 IEEE 5th International Symposium on the Internet of … | 2024-09-30 |
| 602 | Characteristics and Development Trends of Internet Plus Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The ‘Internet Plus Music Education’ utilizes online technologies to transcend traditional educational constraints of time and space, offering students flexible and efficient … |
Xu Ni; Xiyang Chen; | Int. J. Web Based Learn. Teach. Technol. | 2024-09-26 |
| 603 | Tuning Into Bias: A Computational Study of Gender Bias in Song Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents an analysis of gender bias in English song lyrics using topic modeling and bias measurement techniques. |
Danqing Chen; Adithi Satish; Rasul Khanbayov; Carolin M. Schuster; Georg Groh; | arxiv-cs.CL | 2024-09-24 |
| 604 | Transforming Music Education Through Artificial Intelligence: A Systematic Literature Review on Enhancing Music Teaching and Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The advent of artificial intelligence (AI) has brought significant and transformative alterations to traditional music education. This study examines the progress of AI technology … |
Yifang Zhang; Beh Wen Fen; Chao Zhang; Sheng Pi; | Int. J. Interact. Mob. Technol. | 2024-09-24 |
| 605 | SongTrans: An Unified Song Transcription and Alignment Method for Lyrics and Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While there are tools available for lyric or note transcription tasks, they all need pre-processed data which is relatively time-consuming (e.g., vocal and accompaniment separation). Besides, most of these tools are designed to address a single task and struggle with aligning lyrics and notes (i.e., identifying the corresponding notes of each word in lyrics). |
SIWEI WU et. al. | arxiv-cs.SD | 2024-09-22 |
| 606 | Meta-Learning-Based Supervised Domain Adaptation for Melody Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The task of extracting the dominant pitch from polyphonic audio is crucial in the music information retrieval field. A substantial amount of labeled audio data is required to … |
Kavya Ranjan Saxena; Vipul Arora; | 2024 IEEE 34th International Workshop on Machine Learning … | 2024-09-22 |
| 607 | Research on Multi-Modal Music Score Alignment Model for Online Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: As music data storage becomes increasingly diverse in the era of big data, ensuring alignment of music works with the same semantics for online music education is crucial. To … |
Dexin Ren; | J. Adv. Comput. Intell. Intell. Informatics | 2024-09-20 |
| 608 | MuCodec: Ultra Low-Bitrate Music Codec Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address thisissue, we propose MuCodec, specifically targeting music compression andreconstruction tasks at ultra low bitrates. |
YAOXUN XU et. al. | arxiv-cs.SD | 2024-09-20 |
| 609 | Exploring Bat Song Syllable Representations in Self-supervised Audio Encoders Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: How well can deep learning models trained on human-generated sounds distinguish between another species’ vocalization types? |
Marianne de Heer Kloots; Mirjam Knörnschild; | arxiv-cs.SD | 2024-09-19 |
| 610 | M6(GPT)3: Generating Multitrack Modifiable Multi-Minute MIDI Music from Text Using Genetic Algorithms, Probabilistic Methods and GPT Models in Any Progression and Time Signature Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a genetic algorithm for the generation of melodic elements. |
Jakub Poćwiardowski; Mateusz Modrzejewski; Marek S. Tatara; | arxiv-cs.SD | 2024-09-19 |
| 611 | Designing Audio Processing Strategies to Enhance Cochlear Implant Users’ Music Enjoyment Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Cochlear implants (CIs) provide hundreds of thousands of users with increased access to sound, particularly speech, but experiences of music are more varied. Can greater … |
Lloyd May; Aaron Hodges; So Yeon Park; Blair Kaneshiro; Jonathan Berger; | Frontiers Comput. Sci. | 2024-09-19 |
| 612 | M6(GPT)3: Generating Multitrack Modifiable Multi-Minute MIDI Music from Text Using Genetic Algorithms, Probabilistic Methods and GPT Models in Any Progression and Time Signature Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This work introduces the $\text{M}^\text{6}(\text{GPT})^\text{3}$ composer system, capable of generating complete, multi-minute musical compositions with complex structures in any … |
Jakub Poćwiardowski; Mateusz Modrzejewski; Marek S. Tatara; | ArXiv | 2024-09-19 |
| 613 | FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study presents FruitsMusic, a metadata corpus of Japanese idol-group songs in the real world, precisely annotated with who sings what and when. |
Hitoshi Suda; Shunsuke Yoshida; Tomohiko Nakamura; Satoru Fukayama; Jun Ogata; | arxiv-cs.SD | 2024-09-19 |
| 614 | Modeling Musical Knowledge With Quantum Bayesian Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music is a multifaceted art form that requires a nuanced and comprehensive framework for analysis. This framework should encompass correlations among diverse musical attributes, … |
Florian Krebs; Hermann Fürntratt; Roland Unterberger; Franz Graf; | 2024 International Conference on Content-Based Multimedia … | 2024-09-18 |
| 615 | METEOR: Melody-aware Texture-controllable Symbolic Orchestral Music Generation Via Transformer VAE Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose METEOR, a model for generating Melody-aware Texture-controllable re-Orchestration with a Transformer-based variational auto-encoder (VAE). |
Dinh-Viet-Toan Le; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-09-18 |
| 616 | Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce a latent diffusion-based multi-track generation model capable of both source separation and multi-track music synthesis by learning the joint probability distribution of tracks sharing a musical context. |
Tornike Karchkhadze; Mohammad Rasool Izadi; Shlomo Dubnov; | arxiv-cs.SD | 2024-09-18 |
| 617 | Mapping Music Onto Robot Joints for Autonomous Choreographies: PCA-Based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: A dance choreography is the kinematic interpretation of the music that translates the sound into rhythmic and synchronized movements. The dance makes the sounds visible through … |
G. Saviano; Alberto Villani; Domenico Prattichizzo; | 2024 IEEE 8th Forum on Research and Technologies for … | 2024-09-18 |
| 618 | Evaluation of Pretrained Language Models on Music Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Music-text multimodal systems have enabled new approaches to Music Information Research (MIR) applications such as audio-to-text and text-to-audio retrieval, text-based song … |
Yannis Vasilakis; Rachel M. Bittner; Johan Pauwels; | ArXiv | 2024-09-17 |
| 619 | PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Such issues highlight the need for publicly available, copyright-free musical data, in which there is a large shortage, particularly for symbolic music data. To alleviate this issue, we present PDMX: a large-scale open-source dataset of over 250K public domain MusicXML scores collected from the score-sharing forum MuseScore, making it the largest available copyright-free symbolic music dataset to our knowledge. |
Phillip Long; Zachary Novack; Taylor Berg-Kirkpatrick; Julian McAuley; | arxiv-cs.SD | 2024-09-16 |
| 620 | Unveiling and Mitigating Bias in Large Language Model Recommendations: A Path to Fairness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores the interplay between bias and LLM-based recommendation systems, focusing on music, song, and book recommendations across diverse demographic and cultural groups. |
Anindya Bijoy Das; Shahnewaz Karim Sakib; | arxiv-cs.IR | 2024-09-16 |
| 621 | MusicLIME: Explainable Multimodal Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce MusicLIME, a model-agnostic feature importance explanation method designed for multimodal music models. |
Theodoros Sotirou; Vassilis Lyberatos; Orfeas Menis Mastromichalakis; Giorgos Stamou; | arxiv-cs.SD | 2024-09-16 |
| 622 | ELMI: Interactive and Intelligent Sign Language Translation of Lyrics for Song Signing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present ELMI, an accessible song-signing tool that assists in translating lyrics into sign language. |
Suhyeon Yoo; Khai N. Truong; Young-Ho Kim; | arxiv-cs.HC | 2024-09-15 |
| 623 | Prevailing Research Areas for Music AI in The Era of Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We then overview different generative models, forms of evaluating these models, and their computational constraints/limitations. Subsequently, we highlight applications of these generative models towards extensions to multiple modalities and integration with artists’ workflow as well as music education systems. |
Megan Wei; Mateusz Modrzejewski; Aswin Sivaraman; Dorien Herremans; | arxiv-cs.SD | 2024-09-14 |
| 624 | A Survey of Foundation Models for Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The rapid advancement of artificial intelligence (AI) has introduced new ways to analyze music, aiming to replicate human understanding of music and provide related services. |
WENJUN LI et. al. | arxiv-cs.SD | 2024-09-14 |
| 625 | Computational Musicking: Music + Coding As A Hybrid Practice Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: While there is a growing body of research that explores the integration of music and coding in learning environments, much of this work has either emphasised the technical aspects … |
Cameron L. Roberts; Mike Horn; | Behaviour & Information Technology | 2024-09-13 |
| 626 | Towards Leveraging Contrastively Pretrained Neural Audio Embeddings for Recommender Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our experiments demonstrate that neural embeddings, particularly those generated with the Contrastive Language-Audio Pretraining (CLAP) model, present a promising approach to enhancing music recommendation tasks within graph-based frameworks. |
Florian Grötschla; Luca Strässle; Luca A. Lanzendörfer; Roger Wattenhofer; | arxiv-cs.SD | 2024-09-13 |
| 627 | Seed-Music: A Unified Framework for High Quality and Controlled Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Seed-Music, a suite of music generation systems capable of producing high-quality music with fine-grained style control. |
YE BAI et. al. | arxiv-cs.SD | 2024-09-13 |
| 628 | Bridging Paintings and Music — Exploring Emotion Based Music Generation Through Paintings Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This research develops a model capable of generating music that resonates with the emotions depicted in visual arts, integrating emotion labeling, image captioning, and language models to transform visual inputs into musical compositions. |
Tanisha Hisariya; Huan Zhang; Jinhua Liang; | arxiv-cs.SD | 2024-09-12 |
| 629 | Bridging Paintings and Music – Exploring Emotion Based Music Generation Through Paintings IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Rapid advancements in artificial intelligence have significantly enhanced generative tasks involving music and images, employing both unimodal and multimodal approaches. This … |
Tanisha Hisariya; Huan Zhang; Jinhua Liang; | ArXiv | 2024-09-12 |
| 630 | VMAS: Video-to-Music Generation Via Semantic Alignment in Web Music Videos IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a framework for learning to generate background music from video inputs. |
Yan-Bo Lin; Yu Tian; Linjie Yang; Gedas Bertasius; Heng Wang; | arxiv-cs.MM | 2024-09-11 |
| 631 | A Two-Stage Band-Split Mamba-2 Network For Music Separation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper applies Mamba-2 with a two-stage strategy, which introduces residual mapping based on the mask method, effectively compensating for the details absent in the mask and further improving separation performance. |
Jinglin Bai; Yuan Fang; Jiajie Wang; Xueliang Zhang; | arxiv-cs.SD | 2024-09-10 |
| 632 | RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, current separating methods struggle to fully remove noise or excessively suppress signal components, affecting the naturalness and similarity of the processed audio. To tackle this, our study introduces RobustSVC, a novel any-to-one SVC framework that converts noisy vocals into clean vocals sung by the target singer. |
WEI CHEN et. al. | arxiv-cs.SD | 2024-09-10 |
| 633 | An End-to-End Approach for Chord-Conditioned Song Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Given the inaccuracy of automatic chord extractors, we devise a robust cross-attention mechanism augmented with dynamic weight sequence to integrate extracted chord information into song generations and reduce frame-level flaws, and propose a novel model termed Chord-Conditioned Song Generator (CSG) based on it. |
SHUOCHEN GAO et. al. | arxiv-cs.SD | 2024-09-10 |
| 634 | Benchmarking Sub-Genre Classification For Mainstage Dance Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We employ a continuous soft labeling approach to accommodate tracks blending multiple sub-genres, preserving their inherent complexity. |
Hongzhi Shu; Xinglin Li; Hongyu Jiang; Minghao Fu; Xinyu Li; | arxiv-cs.SD | 2024-09-10 |
| 635 | Musical Chords: A Novel Java Algorithm and App Utility to Enumerate Chord-Progressions Adhering to Music Theory Guidelines Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these limitations, a novel Java Algorithm and automated music theory chord progression and variations generator App has been developed. This App offers a piano user interface that applies music theory to generate all possible four-chord and eight-chord progressions and produces three alternate variations of the generated progressions selected by the user. |
Aditya Lakshminarasimhan; | arxiv-cs.SD | 2024-09-09 |
| 636 | SongCreator: Lyrics-based Universal Song Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While various aspects of song generation have been explored by previous works, such as singing voice, vocal composition and instrumental arrangement, etc., generating songs with both vocals and accompaniment given lyrics remains a significant challenge, hindering the application of music generation models in the real world. In this light, we propose SongCreator, a song-generation system designed to tackle this challenge. |
SHUN LEI et. al. | arxiv-cs.SD | 2024-09-09 |
| 637 | Latent Diffusion Bridges for Unsupervised Musical Audio Timbre Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel method based on dual diffusion bridges, trained using the CocoChorales Dataset, which consists of unpaired monophonic single-instrument audio data. |
MICHELE MANCUSI et. al. | arxiv-cs.SD | 2024-09-09 |
| 638 | Mel-RoFormer for Vocal Separation and Vocal Melody Transcription Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Mel-RoFormer, a spectrogram-based model featuring two key designs: a novel Mel-band Projection module at the front-end to enhance the model’s capability to capture informative features across multiple frequency bands, and interleaved RoPE Transformers to explicitly model the frequency and time dimensions as two separate sequences. |
Ju-Chiang Wang; Wei-Tsung Lu; Jitong Chen; | arxiv-cs.SD | 2024-09-06 |
| 639 | Enhancing Sequential Music Recommendation with Personalized Popularity Awareness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Moreover, music consumption is characterized by a prevalence of repeated listening, i.e., users frequently return to their favourite tracks, an important signal that could be framed as individual or personalized popularity. This paper addresses these challenges by introducing a novel approach that incorporates personalized popularity information into sequential recommendation. |
Davide Abbattista; Vito Walter Anelli; Tommaso Di Noia; Craig Macdonald; Aleksandr Vladimirovich Petrov; | arxiv-cs.IR | 2024-09-06 |
| 640 | MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal Precision Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, existing research primarily focuses on Western music and encounters challenges in generating melodies for Chinese traditional music, especially in capturing modal characteristics and emotional expression. To address these issues, we propose a new architecture, the Dual-Feature Modeling Module, which integrates the long-range dependency modeling of the Mamba Block with the global structure capturing capabilities of the Transformer Block. |
JIATAO CHEN et. al. | arxiv-cs.SD | 2024-09-04 |
| 641 | Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This process involves composing each instrument to align with existing ones in terms of beat, dynamics, harmony, and melody, requiring greater precision and control over tracks than text prompts usually provide. In this work, we address these challenges by extending the MusicLDM, a latent diffusion model for music, into a multi-track generative model. |
Tornike Karchkhadze; Mohammad Rasool Izadi; Ke Chen; Gerard Assayag; Shlomo Dubnov; | arxiv-cs.SD | 2024-09-04 |
| 642 | Applications and Advances of Artificial Intelligence in Music Generation:A Review Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In recent years, artificial intelligence (AI) has made significant progress in the field of music generation, driving innovation in music creation and applications. This paper … |
Yanxu Chen; Linshu Huang; Tian Gou; | ArXiv | 2024-09-03 |
| 643 | A Progressive-Adaptive Music Generator (PAMG): An Approach to Interactive Procedural Music for Videogames Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Alvaro Eduardo Lopez Duarte; | El Farmaceutico | 2024-09-02 |
| 644 | Considerations and Concerns of Professional Game Composers Regarding Artificially Intelligent Music Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Artificially intelligent music technology (AIMT) is a promising field with great potential for creating innovation in music. However, the considerations and concerns surrounding … |
Kyle Worrall; Tom Collins; | IEEE Transactions on Games | 2024-09-01 |
| 645 | MMT-BERT: Chord-aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current techniques dedicated to symbolic music generation generally encounter two significant challenges: training data’s lack of information about chords and scales and the requirement of specially designed model architecture adapted to the unique format of symbolic music representation. In this paper, we solve the above problems by introducing new symbolic music representation with MusicLang chord analysis model. |
Jinlong Zhu; Keigo Sakurai; Ren Togo; Takahiro Ogawa; Miki Haseyama; | arxiv-cs.SD | 2024-09-01 |
| 646 | X-Singer: Code-Mixed Singing Voice Synthesis Via Cross-Lingual Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Singing voice synthesis (SVS) systems have exhibited a remarkable ability to synthesize natural singing voices. However, existing methods still depend on the phoneme annotation in … |
Ji-Sang Hwang; Hyeongrae Noh; Yoonseok Hong; Insoo Oh; | Interspeech 2024 | 2024-09-01 |
| 647 | Application Research of Short-Time Fourier Transform in Music Generation Based on The Parallel WaveGan System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite the widespread use of Fourier transform (FT) networks and generative adversarial networks (GANs) in audio signal processing, their practical effectiveness in unsupervised … |
Jun Min; Zhiwei Gao; Lei Wang; Aihua Zhang; | IEEE Transactions on Industrial Informatics | 2024-09-01 |
| 648 | FLUX That Plays Music IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper explores a simple extension of diffusion-based rectified flow Transformers for text-to-music generation, termed as FluxMusic. |
Zhengcong Fei; Mingyuan Fan; Changqian Yu; Junshi Huang; | arxiv-cs.SD | 2024-08-31 |
| 649 | Toward A More Complete OMR Solution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we focus on the MUSCIMA++ v2.0 dataset, which represents musical notation as a graph with pairwise relationships among detected music objects, and we consider both stages together. |
Guang Yang; Muru Zhang; Lin Qiu; Yanming Wan; Noah A. Smith; | arxiv-cs.CV | 2024-08-30 |
| 650 | Research on Music Teaching Systems Assisted By Artificial Intelligence Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In music teaching, the application of artificial intelligence has brought revolutionary changes to teaching and greatly improved the teaching quality and learning effect. The … |
Kuixing Yuan; | Int. J. e Collab. | 2024-08-29 |
| 651 | Music Grounding By Short Video Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to bridge the gap between the practical need for music moment localization and V2MR, we propose a new task termed Music Grounding by Short Video (MGSV). |
ZIJIE XIN et. al. | arxiv-cs.MM | 2024-08-29 |
| 652 | Do Recommender Systems Promote Local Music? A Reproducibility Study Using Music Streaming Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To assess the robustness of this study’s conclusions, we conduct a comparative analysis using proprietary listening data from a global music streaming service, which we publicly release alongside this paper. |
KRISTINA MATROSOVA et. al. | arxiv-cs.IR | 2024-08-29 |
| 653 | Transformers Meet ACT-R: Repeat-Aware and Sequential Listening Session Recommendation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This is a crucial limitation for music recommendation, as repeatedly listening to the same song over time is a common phenomenon that can even change the way users perceive this song. In this paper, we introduce PISA (Psychology-Informed Session embedding using ACT-R), a session-level sequential recommender system that overcomes this limitation. |
Viet-Anh Tran; Guillaume Salha-Galvan; Bruno Sguerra; Romain Hennequin; | arxiv-cs.IR | 2024-08-29 |
| 654 | Royalty Management By Using Blockchain Network: A Multiple Case Study Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The rise of music streaming services has led to significant challenges in royalty management, particularly affecting independent musicians who face poor royalty payments and a … |
Thalea Christy Nathaniela; Elfindah Princes; Gunawan Wang; | 2024 International Conference on Information Management and … | 2024-08-28 |
| 655 | Multimodal Music Datasets? Challenges and Future Goals in Music Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Anna-Maria Christodoulou; Olivier Lartillot; Alexander Refsum Jensenius; | Int. J. Multim. Inf. Retr. | 2024-08-28 |
| 656 | Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a unified framework for automatic multitrack music arrangement that enables a single pre-trained symbolic music model to handle diverse arrangement scenarios, including reinterpretation, simplification, and additive generation. |
LONGSHEN OU et. al. | arxiv-cs.SD | 2024-08-27 |
| 657 | Knowledge Discovery in Optical Music Recognition: Enhancing Information Retrieval with Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We make our implementation, pre-processing scripts, trained models, and evaluation results publicly available to support further research and development. |
Elona Shatri; George Fazekas; | arxiv-cs.IR | 2024-08-27 |
| 658 | Foundation Models for Music: A Survey IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: The paper offers insights into future challenges and trends on FMs for music, aiming to shape the trajectory of human-AI collaboration in the music realm. |
YINGHAO MA et. al. | arxiv-cs.SD | 2024-08-26 |
| 659 | How Robots Influence Human Perception: Investigating The Role of Body Language and Music in Emotion Perception for Social HRI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Emotional dance is an engaging and stimulating multimodal social activity involving the display of both body language and music. An interesting area of research is in the … |
Nan Liang; G. Nejat; | 2024 33rd IEEE International Conference on Robot and Human … | 2024-08-26 |
| 660 | LyCon: Lyrics Reconstruction from The Bag-of-Words Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Our study introduces a novel method for generating copyright-free lyrics from publicly available Bag-of-Words (BoW) datasets, which contain the vocabulary of lyrics but not the lyrics themselves. |
Haven Kim; Kahyun Choi; | arxiv-cs.CL | 2024-08-26 |
| 661 | SONICS: Synthetic Or Not — Identifying Counterfeit Songs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Additionally, existing datasets lack music-lyrics diversity, long-duration songs, and open-access fake songs. To address these gaps, we introduce SONICS, a novel dataset for end-to-end Synthetic Song Detection (SSD), comprising over 97k songs (4,751 hours) with over 49k synthetic songs from popular platforms like Suno and Udio. |
Md Awsafur Rahman; Zaber Ibn Abdul Hakim; Najibul Haque Sarker; Bishmoy Paul; Shaikh Anowarul Fattah; | arxiv-cs.SD | 2024-08-26 |
| 662 | Envelope Based Deep Source Separation and EEG Auditory Attention Decoding for Speech and Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study focuses on auditory attention decoding (AAD) for speech and music. We propose an envelope-based deep source separation strategy on a single microphone system, where the … |
M. A. Tanveer; Jesper Jensen; Zheng-Hua Tan; Jan Østergaard; | 2024 32nd European Signal Processing Conference (EUSIPCO) | 2024-08-26 |
| 663 | Scoring Synchronization Between Music and Motion: Local Vs Global Approaches Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper compares methods for scoring the synchronization between music and motion. A wide range of local and global methods such as the Gaussian-based method, relative phase, … |
Hamza Bayd; Patrice Guyot; Benoît G. Bardy; Pierre Slangen; | 2024 32nd European Signal Processing Conference (EUSIPCO) | 2024-08-26 |
| 664 | Research on The Application of Intelligent Algorithms in The Automation of Music Generation and Composition Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study explores recent advances in intelligent algorithms for the automation of music generation and arrangement, with a particular focus on the potential applications of … |
Hanxiao Ye; | 2024 International Conference on Computers, Information … | 2024-08-26 |
| 665 | SONICS: Synthetic Or Not – Identifying Counterfeit Songs IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The recent surge in AI-generated songs presents exciting possibilities and challenges. While these inventions democratize music creation, they also necessitate the ability to … |
Md Awsafur Rahman; Zaber Ibn Abdul Hakim; Najibul Haque Sarker; Bishmoy Paul; S. Fattah; | ArXiv | 2024-08-26 |
| 666 | Effects of Listening Behaviors of A Social Robot on Adult’s Motivation and Performance in Piano Practice Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The landscape of education with social robots is evolving, especially within the realm of music education. However, past studies have focused on children as music learners and … |
Ryuto Matsusaka; Masahiro Shiomi; Tetsuya Takiguchi; | 2024 33rd IEEE International Conference on Robot and Human … | 2024-08-26 |
| 667 | Teaching Indian Classical Music Using Web-Based Interactive Platform and Real-Time Audio Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music education has traditionally relied on theoretical instruction and sheet music. However, integrating real-time audio analysis and interactive learning tools introduces a … |
ASHWIN P JOBY et. al. | 2024 32nd European Signal Processing Conference (EUSIPCO) | 2024-08-26 |
| 668 | Hybrid Music Recommendation with Graph Neural Networks IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Modern music streaming services rely on recommender systems to help users navigate within their large collections. Collaborative filtering (CF) methods, that leverage past … |
Matej Bevec; Marko Tkalcic; Matevž Pesek; | User Model. User Adapt. Interact. | 2024-08-24 |
| 669 | Fairness of Large Music Models: From A Culturally Diverse Perspective Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper explores the fairness of large music models in generating culturally diverse musical compositions, emphasizing the need for inclusivity and equity in AI-generated … |
Qinyuan Wang; Bruce Gu; He Zhang; Yunfeng Li; | 2024 IEEE 9th International Conference on Data Science in … | 2024-08-23 |
| 670 | Charting The Universe of Metal Music Lyrics and Analyzing Their Relation to Perceived Audio Hardness Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We analyze the relationship between the musical and the lyrical content of metal music by combining automated audio feature extraction and quantitative text analysis on a corpus … |
Isabella Czedik-Eysenberg; Oliver Wieczorek; Arthur Flexer; Christoph Reuter; | Trans. Int. Soc. Music. Inf. Retr. | 2024-08-22 |
| 671 | Information and Motor Constraints Shape Melodic Diversity Across Cultures Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The motor constraint hypothesis accounts for certain similarities, such as scalar motion and contour shape, but not for other major common features, such as repetition, song length, and scale size. Here we investigate the role of information constraints in shaping these hallmarks of melodies. |
JOHN M MCBRIDE et. al. | arxiv-cs.SD | 2024-08-22 |
| 672 | Melody Predominates Over Harmony in The Evolution of Musical Scales Across 96 Countries Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While recent analyses provide mixed support for a role of melody as well as harmony, we lack a comparative analysis based on cross-cultural data. We address this longstanding problem through a rigorous computational comparison of the main theories using 1,314 scales from 96 countries. |
John M McBride; Elizabeth Phillips; Patrick E Savage; Steven Brown; Tsvi Tlusty; | arxiv-cs.SD | 2024-08-22 |
| 673 | Towards Estimating Personal Values in Song Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, as highly subjective text, song lyrics present a challenge in terms of sampling songs to be annotated, annotation methods, and in choosing a method for aggregation. In this project, we take a perspectivist approach, guided by social science theory, to gathering annotations, estimating their quality, and aggregating them. |
Andrew M. Demetriou; Jaehun Kim; Sandy Manolios; Cynthia C. S. Liem; | arxiv-cs.CL | 2024-08-22 |
| 674 | A Tighter Complexity Analysis of SparseGPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we improved the analysis of the running time of SparseGPT [Frantar, Alistarh ICML 2023] from $O(d^{3})$ to $O(d^{\omega} + d^{2+a+o(1)} + d^{1+\omega(1,1,a)-a})$ for any $a \in [0, 1]$, where $\omega$ is the exponent of matrix multiplication. |
Xiaoyu Li; Yingyu Liang; Zhenmei Shi; Zhao Song; | arxiv-cs.DS | 2024-08-22 |
| 675 | Oh, Behave! Country Representation Dynamics Created By Feedback Loops in Music Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we investigate the dynamics of representation of local (i.e., country-specific) and US-produced music in user profiles and recommendations. |
Oleg Lesota; Jonas Geiger; Max Walder; Dominik Kowald; Markus Schedl; | arxiv-cs.IR | 2024-08-21 |
| 676 | SentHYMNent: An Interpretable and Sentiment-Driven Model for Algorithmic Melody Harmonization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we introduce two major novel elements: a nuanced mixture-based representation for musical sentiment, including a web tool to gather data, as well as a sentiment- and theory-driven harmonization model, SentHYMNent. |
STEPHEN HAHN et. al. | kdd | 2024-08-21 |
| 677 | Weighted Singular Value Thresholding and Gradient Optimization of Unbiased Risk Estimate for Rank Estimation in Automatic Music Transcription Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Most of the research works on music transcription assume a priori knowledge regarding the number of musical notes or instruments. This paper proposes two novel algorithms for … |
Bauyrzhan Kurmangaliyev; Muhammad Tahir Akhtar; | 2024 IEEE Pacific Rim Conference on Communications, … | 2024-08-21 |
| 678 | Rage Music Classification and Analysis Using K-Nearest Neighbour, Random Forest, Support Vector Machine, Convolutional Neural Networks, and Gradient Boosting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We compare methods of classification in the application of audio analysis with machine learning and identify optimal models. |
Akul Kumar; | arxiv-cs.SD | 2024-08-20 |
| 679 | Moûsai: Efficient Text-to-Music Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In our work, we bridge text and music via a text-to-music generation model that is highly efficient, expressive, and can handle long-term structure. |
Flavio Schneider; Ojasv Kamal; Zhijing Jin; Bernhard Schölkopf; | acl | 2024-08-20 |
| 680 | DisMix: Disentangling Mixtures of Musical Instruments for Source-level Pitch and Timbre Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To fill the gap, we propose DisMix, a generative framework in which the pitch and timbre representations act as modular building blocks for constructing the melody and instrument of a source, and the collection of which forms a set of per-instrument latent representations underlying the observed mixture. |
YIN-JYUN LUO et. al. | arxiv-cs.SD | 2024-08-20 |
| 681 | Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and Accompaniment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel task called Text-to-Song synthesis which incorporates both vocal and accompaniment generation. |
ZHIQING HONG et. al. | acl | 2024-08-20 |
| 682 | Rhyme-aware Chinese Lyric Generator Based on GPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To enhance the rhyming quality of generated lyrics, we incorporate integrated rhyme information into our model, thereby improving lyric generation performance. |
YIXIAO YUAN et. al. | arxiv-cs.CL | 2024-08-19 |
| 683 | The Evolution of Inharmonicity and Noisiness in Contemporary Popular Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we use modified MPEG-7 features to explore and characterise the evolution of noise and inharmonicity in popular music since 1961. |
Emmanuel Deruty; David Meredith; Stefan Lattner; | arxiv-cs.SD | 2024-08-15 |
| 684 | A Theory-Based Explainable Deep Learning Architecture for Music Emotion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper develops a theory-based, explainable deep learning convolutional neural network classifier to predict the time-varying emotional response to music. … |
H. Fong; Vineet Kumar; K. Sudhir; | Mark. Sci. | 2024-08-13 |
| 685 | A New Dataset, Notation Software, and Representation for Computational Schenkerian Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With a larger corpus of Schenkerian data, it may be possible to infuse machine learning models with a deeper understanding of musical structure, thus leading to more human results. To encourage further research in Schenkerian analysis and its potential benefits for music informatics and generation, this paper presents three main contributions: 1) a new and growing dataset of SchAs, the largest in human- and computer-readable formats to date (>140 excerpts), 2) a novel software for visualization and collection of SchA data, and 3) a novel, flexible representation of SchA as a heterogeneous-edge graph data structure. |
STEPHEN NI-HAHN et. al. | arxiv-cs.SD | 2024-08-13 |
| 686 | Surveying More Than Two Decades of Music Information Retrieval Research on Playlists Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this paper, we present an extensive survey of music information retrieval (MIR) research into music playlists. Our survey spans more than 20 years, and includes around 300 … |
Giovanni Gabbolini; Derek Bridge; | ACM Trans. Intell. Syst. Technol. | 2024-08-12 |
| 687 | Tactile Melodies: A Desk-Mounted Haptics for Perceiving Musical Experiences Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a novel interface for experiencing music through haptic impulses to the palm of the hand. |
Raj Varshith Moora; Gowdham Prabhakar; | arxiv-cs.HC | 2024-08-12 |
| 688 | TEAdapter: Supply Abundant Guidance for Controllable Text-to-music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Accordingly, we introduce the TEAcher Adapter (TEAdapter), a compact plugin designed to guide the generation process with diverse control information provided by users. |
JIALING ZOU et. al. | arxiv-cs.SD | 2024-08-09 |
| 689 | The Algorithmic Nature of Song-sequencing: Statistical Regularities in Music Albums Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Based on a review of anecdotal beliefs, we explored patterns of track-sequencing within professional music albums. |
Pedro Neto; Martin Hartmann; Geoff Luck; Petri Toiviainen; | arxiv-cs.MM | 2024-08-08 |
| 690 | Quantifying The Corpus Bias Problem in Automatic Music Transcription Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We identify two primary sources of distribution shift: the music, and the sound. |
Lukáš Samuel Marták; Patricia Hu; Gerhard Widmer; | arxiv-cs.SD | 2024-08-08 |
| 691 | The Billboard Melodic Music Dataset (BiMMuDa) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We introduce the Billboard Melodic Music Dataset (BiMMuDa), which contains the lead vocal melodies of the top five songs of each year from 1950 to 2022 according to the Billboard … |
Madeline Hamilton; Ana Clemente; Edward T. R. Hall; Marcus Pearce; | Trans. Int. Soc. Music. Inf. Retr. | 2024-08-07 |
| 692 | InstructME: An Instruction Guided Music Edit Framework with Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we develop InstructME, an Instruction guided Music Editing and remixing framework based on latent diffusion models. |
BING HAN et. al. | ijcai | 2024-08-03 |
| 693 | Retrieval Guided Music Captioning Via Multimodal Prefixes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper we put forward a new approach to music captioning, the task of automatically generating natural language descriptions for songs. |
Nikita Srivatsan; Ke Chen; Shlomo Dubnov; Taylor Berg-Kirkpatrick; | ijcai | 2024-08-03 |
| 694 | MusicMagus: Zero-Shot Text-to-Music Editing Via Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, the task of editing these generated music remains a significant challenge. This paper introduces a novel approach to edit music generated by such models, enabling the modification of specific attributes, such as genre, mood, and instrument, while maintaining other aspects unchanged. |
YIXIAO ZHANG et. al. | ijcai | 2024-08-03 |
| 695 | MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in The Field of Music IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To this end, we present MuChin, the first open-source music description benchmark in Chinese colloquial language, designed to evaluate the performance of multimodal LLMs in understanding and describing music. |
ZIHAO WANG et. al. | ijcai | 2024-08-03 |
| 696 | Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Current lyric-to-melody generation methods struggle with the lack of paired lyric-melody data to train, and the lack of adherence to composition guidelines, resulting in melodies that do not sound human-composed. To address these issues, we propose a novel paradigm called Re-creation of Creations (ROC) that combines the strengths of both rule-based and neural-based methods. |
Ang Lv; Xu Tan; Tao Qin; Tie-Yan Liu; Rui Yan; | ijcai | 2024-08-03 |
| 697 | Generating High-quality Symbolic Music Using Fine-grained Discriminators Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose to decouple the melody and rhythm from music, and design corresponding fine-grained discriminators to tackle the aforementioned issues. |
ZHEDONG ZHANG et. al. | arxiv-cs.SD | 2024-08-03 |
| 698 | Arrange, Inpaint, and Refine: Steerable Long-term Music Audio Generation and Editing Via Content-based Controls IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: While Large Language Models (LLMs) have shown promise in generating high-quality music, their focus on autoregressive generation limits their utility in music editing tasks. To address this gap, we propose a novel approach leveraging a parameter-efficient heterogeneous adapter combined with a masking training scheme. |
Liwei Lin; Gus Xia; Yixiao Zhang; Junyan Jiang; | ijcai | 2024-08-03 |
| 699 | Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce the Nested Music Transformer (NMT), an architecture tailored for decoding compound tokens autoregressively, similar to processing flattened tokens, but with low memory usage. |
Jiwoo Ryu; Hao-Wen Dong; Jongmin Jung; Dasaem Jeong; | arxiv-cs.SD | 2024-08-02 |
| 700 | Music2P: A Multi-Modal AI-Driven Tool for Simplifying Album Cover Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Our ultimate goal is to provide a tool that empowers musicians and producers, especially those with limited resources or expertise, to create compelling album covers. |
Joong Ho Choi; Geonyeong Choi; Ji-Eun Han; Wonjin Yang; Zhi-Qi Cheng; | arxiv-cs.MM | 2024-08-02 |
| 701 | PiCoGen2: Piano Cover Generation with Transfer Learning Approach and Weakly Aligned Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This would, however, result in the loss of piano information and accordingly cause inconsistencies between the original and remapped piano versions. To overcome this limitation, we propose a transfer learning approach that pre-trains our model on piano-only data and fine-tunes it on weakly-aligned paired data constructed without note remapping. |
Chih-Pin Tan; Hsin Ai; Yi-Hsin Chang; Shuen-Huei Guan; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-08-02 |
| 702 | Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce a project that revives a piece of 15th-century Korean court music, Chihwapyeong and Chwipunghyeong, composed upon the poem Songs of the Dragon Flying to Heaven. |
DANBINAERIN HAN et. al. | arxiv-cs.SD | 2024-08-02 |
| 703 | MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, their evaluation poses considerable challenges, and it remains unclear how to effectively assess their ability to correctly interpret music-related inputs with current methods. Motivated by this, we introduce MuChoMusic, a benchmark for evaluating music understanding in multimodal language models focused on audio. |
BENNO WECK et. al. | arxiv-cs.SD | 2024-08-02 |
| 704 | METEOR: Melody-aware Texture-controllable Symbolic Music Re-Orchestration Via Transformer VAE Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Re-orchestration is the process of adapting a music piece for a different set of instruments. By altering the original instrumentation, the orchestrator often modifies the musical … |
Dinh-Viet-Toan Le; Yi-Hsuan Yang; | International Joint Conference on Artificial Intelligence | 2024-08-01 |
| 705 | ChordSync: Conformer-Based Alignment of Chord Annotations to Music Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce ChordSync, a novel conformer-based model designed to seamlessly align chord annotations with audio, eliminating the need for weak alignment. |
Andrea Poltronieri; Valentina Presutti; Martín Rocamora; | arxiv-cs.SD | 2024-08-01 |
| 706 | Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Through our baseline, we illustrate how building on top of past research can offer alternatives for music difficulty assessment which are explainable and interpretable. With this, we aim to promote a more effective communication between the Music Information Retrieval (MIR) community and the music education one. |
Pedro Ramoneda; Vsevolod Eremenko; Alexandre D’Hooge; Emilia Parada-Cabaleiro; Xavier Serra; | arxiv-cs.SD | 2024-08-01 |
| 707 | Can LLMs Reason in Music? An Evaluation of LLMs’ Capability of Music Understanding and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent research has extended the application of large language models (LLMs) such as GPT-4 and Llama2 to the symbolic music domain including understanding and generation. Yet scant research explores the details of how these LLMs perform on advanced music understanding and conditioned generation, especially from the multi-step reasoning perspective, which is a critical aspect in the conditioned, editable, and interactive human-computer co-creation process. |
ZIYA ZHOU et. al. | arxiv-cs.SD | 2024-07-31 |
| 708 | Lyrics Transcription for Humans: A Readability-Aware Benchmark Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Writing down lyrics for human consumption involves not only accurately capturing word sequences, but also incorporating punctuation and formatting for clarity and to convey … |
Ondrej Cífka; Hendrik Schreiber; Luke Miner; Fabian-Robert Stöter; | ArXiv | 2024-07-30 |
| 709 | Emotion-driven Piano Music Generation Via Two-stage Disentanglement and Functional Representation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To further capture features that shape valence, an aspect less explored by previous approaches, we introduce a novel functional representation of symbolic music. |
Jingyue Huang; Ke Chen; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-07-30 |
| 710 | PiCoGen: Generate Piano Covers with A Two-stage Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Cover song generation stands out as a popular way of music making in the music-creative community. In this study, we introduce Piano Cover Generation (PiCoGen), a two-stage … |
Chih-Pin Tan; Shuen-Huei Guan; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-07-30 |
| 711 | Emotion-Driven Melody Harmonization Via Melodic Variation and Functional Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a novel functional representation for symbolic music. |
Jingyue Huang; Yi-Hsuan Yang; | arxiv-cs.SD | 2024-07-29 |
| 712 | Futga: Towards Fine-grained Music Understanding Through Temporally-enhanced Generative Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Existing music captioning methods are limited to generating concise global descriptions of short music clips, which fail to capture fine-grained musical characteristics and time-aware musical changes. To address these limitations, we propose FUTGA, a model equipped with fine-grained music understanding capabilities through learning from generative augmentation with temporal compositions. |
JUNDA WU et. al. | arxiv-cs.SD | 2024-07-29 |
| 713 | Towards Music Instrument Classification Using Convolutional Neural Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recognizing musical instruments from an audio signal is a challenging yet valuable endeavor within the realm of music study. The recognition and classification of musical … |
Paul Tiemeijer; Mahyar Shahsavari; Mahmood Fazlali; | 2024 IEEE International Conference on Omni-layer … | 2024-07-29 |
| 714 | Start from Video-Music Retrieval: An Inter-Intra Modal Loss for Cross Modal Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Experiments also show that II loss improves various self-supervised and supervised uni-modal and cross-modal retrieval tasks, and can obtain good retrieval models with a small amount of training samples. |
ZEYU CHEN et. al. | arxiv-cs.MM | 2024-07-28 |
| 715 | Exploring Genre and Success Classification Through Song Lyrics Using DistilBERT: A Fun NLP Venture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents a natural language processing (NLP) approach to the problem of thoroughly comprehending song lyrics, with particular attention on genre classification, … |
Servando Pizarro Martinez; Moritz Zimmermann; Miguel Serkan Offermann; Florian Reither; | ArXiv | 2024-07-28 |
| 716 | Playlists and Genre: The Role of Music Genre in Spotify’s Playlists Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: PurposeGenre is a valuable access point for popular music collections; however, the blurring of genre boundaries combined with changing listening habits and new forms of … |
Callum McDonald; A. Foster; Pauline Rafferty; | J. Documentation | 2024-07-26 |
| 717 | Simulation of Neural Responses to Classical Music Using Organoid Intelligence Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Hence, we present the PyOrganoid library, an innovative tool that facilitates the simulation of organoid learning models, integrating sophisticated machine learning techniques with biologically inspired organoid simulations. |
Daniel Szelogowski; | arxiv-cs.NE | 2024-07-25 |
| 718 | Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, editing music audios remains challenging due to the conflicting desiderata of performing fine-grained alterations on the audio while maintaining a simple user interface. To address this challenge, we propose Audio Prompt Adapter (or AP-Adapter), a lightweight addition to pretrained text-to-music models. |
FANG-DUO TSAI et. al. | arxiv-cs.SD | 2024-07-23 |
| 719 | Enhancing Mental Health With PHILOI: A Comprehensive Analysis of Mood Music and Chatbot Module Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The project aim was to develop an app that would enable the recording and monitoring of behaviour related to specific aspects of wellness, as well as support those aspects of … |
Kavita Kumavat; D. Gatagat; K. Wakode; S. Gundawar; V. Jain; | EAI Endorsed Trans. Mob. Commun. Appl. | 2024-07-22 |
| 720 | Towards Assessing Data Replication in Music Generation with Music Similarity Metrics on Raw Audio IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: A relevant discussion and related technical challenge is the potential replication and plagiarism of the training set in AI-generated music, which could lead to misuse of data and intellectual property rights violations. To tackle this issue, we present the Music Replication Assessment (MiRA) tool: a model-independent open evaluation method based on diverse audio music similarity metrics to assess data replication. |
Roser Batlle-Roca; Wei-Hsiang Liao; Xavier Serra; Yuki Mitsufuji; Emilia Gómez; | arxiv-cs.SD | 2024-07-19 |
| 721 | Reducing Barriers to The Use of Marginalised Music Genres in AI Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: XAI opportunities identified included topics of improving transparency and control of AI models, explaining the ethics and bias of AI models, fine tuning large models with small datasets to reduce bias, and explaining style-transfer opportunities with AI models. Participants in the research emphasised that whilst it is hard to work with small datasets such as marginalised music and AI, such approaches strengthen cultural representation of underrepresented cultures and contribute to addressing issues of bias of deep learning models. |
Nick Bryan-Kinns; Zijin Li; | arxiv-cs.SD | 2024-07-18 |
| 722 | Audio Conditioning for Music Generation Via Discrete Bottleneck Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For the second model we train a music language model from scratch jointly with a text conditioner and a quantized audio feature extractor. |
Simon Rouard; Yossi Adi; Jade Copet; Axel Roebel; Alexandre Défossez; | arxiv-cs.SD | 2024-07-17 |
| 723 | A New Model of Vocal Music Teaching in The Context of Internet Distance Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: As Internet technology evolves, distance learning emerges as a pivotal mode of education. In music education, vocal teaching faces limitations in traditional face-to-face methods. … |
Xiaochen Zhang; Junkai Zhang; | Int. J. Web Based Learn. Teach. Technol. | 2024-07-17 |
| 724 | GraphMuse: A Library for Symbolic Music Graph Processing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Graph Neural Networks (GNNs) have recently gained traction in symbolic music tasks, yet a lack of a unified framework impedes progress. Addressing this gap, we present GraphMuse, a graph processing framework and library that facilitates efficient music graph processing and GNN training for symbolic music tasks. |
Emmanouil Karystinaios; Gerhard Widmer; | arxiv-cs.SD | 2024-07-17 |
| 725 | Harmonic Frequency-Separable Transformer for Instrument-Agnostic Music Transcription Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Automatic Music Transcription (AMT) aims to convert music audio into symbolic representations. Recently, transformer-based methods have been successfully applied to … |
YULUN WU et. al. | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
| 726 | Multitrack Emotion-Based Music Generation Network Using Continuous Symbolic Features Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Musicians need an efficient composition process to yield a multitude of musical pieces. To enhance the artistry and emotional relevance of AI-generated music, a novel Multi-Track … |
DONGHUI ZHANG et. al. | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
| 727 | Popular Hooks: A Multimodal Dataset of Musical Hooks for Music Understanding and Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The Internet is rich in unimodal music data, available in either symbolic or audio representations. However, there is a notable scarcity of multimodal music datasets that offer … |
Xinda Wu; Jiaming Wang; Jiaxing Yu; Tieyao Zhang; Kejun Zhang; | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
| 728 | TEAdapter: Supply Vivid Guidance for Controllable Text-to-Music Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Although current text-guided music generation technology can cope with simple creative scenarios, achieving finegrained control over individual text-modality conditions remains … |
JIALING ZOU et. al. | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
| 729 | RevNet: A Review Network with Group Aggregation Fusion for Singing Melody Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Singing melody extraction (SME) is a critical task in the field of music information retrieval (MIR). Recently, deep learning based methods have achieved remarkable successes for … |
Shuai Yu; Xiaoliang He; Yanting Zhang; | 2024 IEEE International Conference on Multimedia and Expo … | 2024-07-15 |
| 730 | A User-Guided Generation Framework for Personalized Music Synthesis Using Interactive Evolutionary Computation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The development of generative artificial intelligence (AI) has demonstrated notable advancements in the domain of music synthesis. However, a perceived lack of creativity in the … |
Yanan Wang; Yan Pei; Zerui Ma; Jianqiang Li; | GECCO Companion | 2024-07-14 |
| 731 | Striking The Right Chord: A Comprehensive Approach to Amazon Music Search Spell Correction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we build a multi-stage framework for spell correction solution for music, media and named entity heavy search engines. |
Siddharth Sharma; Shiyun Yang; Ajinkya Walimbe; Tarun Sharma; Joaquin Delgado; | sigir | 2024-07-14 |
| 732 | The Interpretation Gap in Text-to-Music Generation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a framework to describe the musical interaction process, which includes expression, interpretation, and execution of controls. |
Yongyi Zang; Yixiao Zhang; | arxiv-cs.SD | 2024-07-14 |
| 733 | A Preliminary Investigation on Flexible Singing Voice Synthesis Through Decomposed Framework with Inferrable Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As collecting large singing datasets labeled with music scores is an expensive task, we investigate an alternative approach by decomposing the SVS system and inferring different singing voice features. |
Lester Phillip Violeta; Taketo Akama; | arxiv-cs.SD | 2024-07-12 |
| 734 | Music Proofreading with RefinPaint: Where and How to Modify Compositions Given Context Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose RefinPaint, an iterative technique that improves the sampling process. |
Pedro Ramoneda; Martin Rocamora; Taketo Akama; | arxiv-cs.SD | 2024-07-12 |
| 735 | Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale Adversarial Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This bias often arises when masked tokens cannot be inferred from their context, forcing the model to overfit the training set instead of generalizing. To address this challenge, we propose Adversarial-MidiBERT for SMU, which adaptively determines what to mask during MLM via a masker network, rather than employing random masking. |
Zijian Zhao; | arxiv-cs.SD | 2024-07-11 |
| 736 | From Real to Cloned Singer Identification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: As such, methods to identify the original singer in synthetic voices are needed. In this paper, we investigate how singer identification methods could be used for such a task. |
Dorian Desblancs; Gabriel Meseguer-Brocal; Romain Hennequin; Manuel Moussallam; | arxiv-cs.SD | 2024-07-11 |
| 737 | Music Genre Classification Using Contrastive Dissimilarity Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the digital age, streaming platforms have revolutionized how we access and interact with music, highlighting the need for more intuitive ways to organize and categorize our … |
Gabriel Henrique Costanzi; Lucas O. Teixeira; G. Felipe; George D. C. Cavalcanti; Yandre M. G. Costa; | 2024 31st International Conference on Systems, Signals and … | 2024-07-09 |
| 738 | XGBoost-based Music Emotion Recognition with Emobase Emotional Features Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The growing integration of music into daily life, especially for relaxation and stress relief, highlights the need for innovative methods to analyze and categorize emotional … |
Pyi Bhone Kyaw; Li Cho; | 2024 International Conference on Consumer Electronics – … | 2024-07-09 |
| 739 | Music Era Recognition Using Supervised Contrastive Learning and Artist Information Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We formulate the task as a music classification problem and propose solutions based on supervised contrastive learning. |
QIQI HE et al. | arxiv-cs.SD | 2024-07-07 |
| 740 | MelodyVis: Visual Analytics for Melodic Patterns in Sheet Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Thus, we present MelodyVis, a visual application designed in collaboration with musicology experts to explore melodic patterns in digital sheet music. |
MATTHIAS MILLER et al. | arxiv-cs.HC | 2024-07-07 |
| 741 | Exploring Real-Time Music-to-Image Systems for Creative Inspiration in Music Creation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a study on the use of a real-time music-to-image system as a mechanism to support and inspire musicians during their creative process. |
Meng Yang; Maria Teresa Llano; Jon McCormack; | arxiv-cs.HC | 2024-07-07 |
| 742 | PAGURI: A User Experience Study of Creative Interaction with Text-to-music Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We developed an online tool through which users can generate music samples and/or apply recently proposed personalization techniques based on fine-tuning to allow the text-to-music model to generate sounds closer to their needs and preferences. Using semi-structured interviews, we analyzed different aspects related to how participants interacted with the proposed tool to understand the current effectiveness and limitations of text-to-music models in enhancing users’ creativity. |
Francesca Ronchini; Luca Comanducci; Gabriele Perego; Fabio Antonacci; | arxiv-cs.SD | 2024-07-05 |
| 743 | MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation Through Pre-Training and Counterfactual Loss Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The model often fails to respond adequately to new, fine-grained bar-level control signals. To address this, we propose two innovative solutions. |
Yangyang Shu; Haiming Xu; Ziqin Zhou; Anton van den Hengel; Lingqiao Liu; | arxiv-cs.SD | 2024-07-05 |
| 744 | MUSIC-lite: Efficient MUSIC Using Approximate Computing: An OFDM Radar Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present MUSIC-lite, which exploits approximate computing to generate a design space exploring accuracy-area-power trade-offs. |
Rajat Bhattacharjya; Arnab Sarkar; Biswadip Maity; Nikil Dutt; | arxiv-cs.AR | 2024-07-05 |
| 745 | MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel task of Colloquial Description-to-Song Generation, which focuses on aligning the generated content with colloquial human expressions. |
ZIHAO WANG et al. | arxiv-cs.SD | 2024-07-03 |
| 746 | MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In the domain of symbolic music research, the progress of developing scalable systems has been notably hindered by the scarcity of available training data and the demand for models tailored to specific tasks. To address these issues, we propose MelodyT5, a novel unified framework that leverages an encoder-decoder architecture tailored for symbolic music processing in ABC notation. |
Shangda Wu; Yashan Wang; Xiaobing Li; Feng Yu; Maosong Sun; | arxiv-cs.SD | 2024-07-02 |
| 747 | Music Teaching Strategy and Educational Resource Sharing Based on Big Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the rapid development of the information age, big data technology has been widely penetrated into various industries, and has brought profound impact on its structure and … |
Lixin Sun; Qiuying Wang; | Journal of Computational Methods in Science and Engineering | 2024-07-01 |
| 748 | Using Artificial Intelligence to Analyze and Classify Music Emotion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the rapid development of music digitization and online streaming services, automatic analysis and classification of music content has become an urgent need. This research … |
Hongyu Liu; | Journal of Computational Methods in Science and Engineering | 2024-07-01 |
| 749 | Double Music Recommendation Algorithm Based on Multi-label Propagation Hierarchical Clustering Analysis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: To enhance the precision of the music recommendation environment system, a novel design approach has been introduced, utilizing multi-label propagation and hierarchical clustering … |
Yun Peng; | Journal of Computational Methods in Science and Engineering | 2024-07-01 |
| 750 | Ideary: Facilitating Electronic Music Creation with Generative AI Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This project explores how generative AI can assist electronic music producers in composing rhythms and melodies. Based on detailed user research with electronic music producers, a … |
Adam Hollmén Larsen; Jichen Zhu; | Companion Publication of the 2024 ACM Designing Interactive … | 2024-07-01 |
| 751 | Remote Rhythms: Audience-informed Insights for Designing Remote Music Performances Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper examines the design of technology for remote music performances, from the perspective of their audiences. In this process, we involved a total of 104 participants … |
Sophia Ppali; Max Scorer; Elena Ppali; Boyd Branch; Alexandra Covaci; | Proceedings of the 2024 ACM Designing Interactive Systems … | 2024-07-01 |
| 752 | Spectral and Pitch Components of CQT Spectrum for Emotion Recognition Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With increasing technological advancements, man’s dependence on machines is also growing. Thus, effective Speech Emotion Recognition (SER) is vital for good human-machine … |
S. Uthiraa; Akshat Vora; Prathamesh Bonde; Aditya Pusuluri; Hemant A. Patil; | 2024 International Conference on Signal Processing and … | 2024-07-01 |
| 753 | Novice-Centered Application Design for Music Creation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the field of music production, the integration of machine learning-based technologies, particularly in the form of intelligent digital audio workstations and smart plugins, has … |
Atsuya Kobayashi; Tetsuro Sato; Kei Tateno; | Companion Publication of the 2024 ACM Designing Interactive … | 2024-07-01 |
| 754 | Pictures Of MIDI: Controlled Music Generation Via Graphical Prompts for Image-Based Diffusion Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study explores a user-friendly graphical interface enabling the drawing of masked regions for inpainting by an Hourglass Diffusion Transformer (HDiT) model trained on MIDI piano roll images. |
Scott H. Hawley; | arxiv-cs.SD | 2024-07-01 |
| 755 | Emotion-Conditioned MusicLM: Enhancing Emotional Resonance in Music Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Nowadays, most music generation models are limited to accepting conditions from a single modality, whether text-based or neural signal-based. For most users, text represents the … |
Yuelang Sun; Matthew Kuo; Xiaodan Wang; Weihua Li; Quan-wei Bai; | 2024 IEEE Congress on Evolutionary Computation (CEC) | 2024-06-30 |
| 756 | RPA-SCD: Rhythm and Pitch Aware Dual-Branch Network for Songs Conversion Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Song voice conversion tools have gained more and more popularity in the recent past. People have been uploading their self-made forgery songs on video websites, and these songs … |
Mingshan Du; Hongxia Wang; Rui Zhang; Zihan Yan; | 2024 International Joint Conference on Neural Networks … | 2024-06-30 |
| 757 | A Combination of LERT and CNN-BILSTM Models for Chinese Music Named Entity Recognition Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Named Entity Recognition (NER) is an important task in the field of natural language processing(NLP), serving as the foundation for knowledge graphs, recommendation systems, … |
Chaoguo Wang; Liang Zhang; Wei Yan; | 2024 International Joint Conference on Neural Networks … | 2024-06-30 |
| 758 | Harmonizing Tradition with Technology: Using AI in Traditional Music Preservation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Traditional music plays a unique role in preserving our history, connecting us to our roots, and fostering a sense of identity and continuity in a rapidly changing world. However, … |
Tiexin Yu; Xinxia Wang; Xu Xiao; Rongshan Yu; | 2024 International Joint Conference on Neural Networks … | 2024-06-30 |
| 759 | Open Edirom: From Hybrid Music Edition to Open Data Publication Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The OPEN Edirom project is developing a digital edition of incidental music for Goethe’s play Faust, representing an innovative initiative within the realm of music philology and … |
Lena Frömmel; Tobias Bachmann; Anna Plaksin; Andreas Münzmay; | Proceedings of the 11th International Conference on Digital … | 2024-06-27 |
| 760 | Svara-forms and Coarticulation in Carnatic Music: An Investigation Using Deep Clustering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Across musical genres worldwide, there are many styles where the shortest conceptual units (e.g., notes) are often performed with ornamentation rather than as static pitches. … |
Thomas Nuttall; Xavier Serra; Lara Pearson; | Proceedings of the 11th International Conference on Digital … | 2024-06-27 |
| 761 | An Online Tool for Semi-Automatically Annotating Music Scores for Optical Music Recognition Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The paper describes an online tool, OMRAT, for semi-automatic annotation of music scores for Optical Music Recognition (OMR) systems. OMRAT uses deep neural networks, machine … |
Stanisław Graczyk; Zuzanna Piniarska; Mateusz Kałamoniak; Tomasz Łukaszewski; Ewa Łukasik; | Proceedings of the 11th International Conference on Digital … | 2024-06-27 |
| 762 | Acoustic Classification of Guitar Tunings with Deep Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: A guitar tuning is the allocation of pitches to the open strings of the guitar. A wide variety of guitar tunings are featured in genres such as blues, classical, folk, and rock. … |
Edward Hulme; David Marshall; K. Sidorov; Andrew Jones; | Proceedings of the 11th International Conference on Digital … | 2024-06-27 |
| 763 | Subtractive Training for Music Stem Insertion Using Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Subtractive Training, a simple and novel method for synthesizing individual musical instrument stems given other instruments as context. |
IVAN VILLA-RENTERIA et al. | arxiv-cs.SD | 2024-06-27 |
| 764 | PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Learning musical structures and composition patterns is necessary for both music generation and understanding, but current methods do not make uniform use of learned features to … |
XIAO LIANG et al. | 2024 IEEE International Conference on Multimedia and Expo … | 2024-06-26 |
| 765 | Deep Learning-Enhanced Emotion-Based Music System with Age and Language Personalization Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music plays an important role in understanding human mood, and facial expressions act as a quick form of non-linguistic communication. The emotion-based music playing system … |
Kariveda Trisha; Padigela Srinithya Reddy; Nichenametla Hima Sree; Tripty Singh; Mansi Sharma; | 2024 15th International Conference on Computing … | 2024-06-24 |
| 766 | Music Genre Classification Using CNN and MobileNetV2 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper represents a significant advancement in the field of music genre classification and recommendation by harnessing the power of MobileNet’s transfer learning capabilities … |
G. RAMESH et al. | 2024 15th International Conference on Computing … | 2024-06-24 |
| 767 | Reflection Across AI-based Music Composition IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Reflection is fundamental to creative practice. However, the plurality of ways in which people reflect when using AI Generated Content (AIGC) is underexplored. This paper takes … |
COREY FORD et al. | Proceedings of the 16th Conference on Creativity & Cognition | 2024-06-23 |
| 768 | Analysing The Effectiveness of Online Digital Audio Software and Offline Audio Studios in Fostering Chinese Folk Music Composition Skills in Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Object: This study was designed to compare the effectiveness of online digital audio software Logic Pro X and a university offline audio studio in terms of the perspective and … |
Xiaowei Lei; | J. Comput. Assist. Learn. | 2024-06-22 |
| 769 | Integrating Sentiment Features in Factorization Machines: Experiments on Music Recommender Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music recommender systems play a pivotal role in catering to diverse user preferences and fostering personalized listening experiences. At the same time, sentiments can profoundly … |
Javier Wang; Alejandro Bellogín; Iván Cantador; | Proceedings of the 32nd ACM Conference on User Modeling, … | 2024-06-22 |
| 770 | The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: With ZIQI-Eval, we aim to provide a standardized and robust evaluation framework that facilitates a comprehensive assessment of LLMs’ music-related abilities. |
JIAJIA LI et al. | arxiv-cs.SD | 2024-06-22 |
| 771 | Emotion-Based Music Recommendation from Quality Annotations and Large-Scale User-Generated Tags Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Emotions constitute an important aspect when listening to music. While manual annotations from user studies grounded in psychological research on music and emotions provide a … |
MARTA MOSCATI et al. | Proceedings of the 32nd ACM Conference on User Modeling, … | 2024-06-22 |
| 772 | Mustango: Toward Controllable Text-to-Music Generation IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose Mustango: a music-domain-knowledge-inspired text-to-music system based on diffusion. |
JAN MELECHOVSKY et al. | naacl | 2024-06-20 |
| 773 | Emotion-aware Personalized Music Recommendation with A Heterogeneity-aware Deep Bayesian Network Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this article, we propose four types of heterogeneity that an EMRS should account for: emotion heterogeneity across users, emotion heterogeneity within a user, music mood preference heterogeneity across users, and music mood preference heterogeneity within a user. |
ERKANG JING et al. | arxiv-cs.AI | 2024-06-20 |
| 774 | Unveiling Emotions and Themes in Thai Songs Via Topic Modeling Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents a comprehensive analysis of Thai song lyrics through the application of Latent Dirichlet Allocation (LDA), exploring the thematic and emotional landscapes … |
Panupong Wanjantuk; Niyom Pinitkarn; Sonephanom Sengthavideth; Jirat Matthayomnan; Anupap Meesomboon; | 2024 21st International Joint Conference on Computer … | 2024-06-19 |
| 775 | Information Seeking Behaviour in Music Conductors’ Repertoire Selection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Introduction. Music repertoire selection is a process driven by music conductors. They focus on scoring, ensemble composition, acquisition methods (i.e., acquiring the music). … |
Christina Firkins; Michael Barrett-Berg; Ina Fourie; | Inf. Res. | 2024-06-18 |
| 776 | MusicScore: A Dataset for Music Score Modeling and Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose MusicScore, a large-scale music score dataset collected and processed from the International Music Score Library Project (IMSLP). |
Yuheng Lin; Zheqi Dai; Qiuqiang Kong; | arxiv-cs.MM | 2024-06-17 |
| 777 | Finding Resilience Through Music for Neurodivergent Children Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This research presents co-design of a music-making robotic interaction with the goal of supporting self-regulation and creative expression for children who identify as … |
Harkirat Kaur; | Proceedings of the 23rd Annual ACM Interaction Design and … | 2024-06-17 |
| 778 | Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present JASCO, a temporally controlled text-to-music generation model utilizing both symbolic and audio-based conditions. |
Or Tal; Alon Ziv; Itai Gat; Felix Kreuk; Yossi Adi; | arxiv-cs.SD | 2024-06-16 |
| 779 | A Bayesian Drift-Diffusion Model of Schachter-Singer’s Two Factor Theory of Emotion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we adopt the same Bayesian framework to model emotion process in accordance with Schachter-Singer’s Two-Factor theory, which argues that emotion is the outcome of cognitive labeling or attribution of a diffuse pattern of autonomic arousal (Schachter & Singer, 1962). |
Lance Ying; Audrey Michal; Jun Zhang; | arxiv-cs.CE | 2024-06-16 |
| 780 | Diff-BGM: A Diffusion Model for Video Background Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose to align the video and music sequentially by introducing a segment-aware cross-attention layer. |
Sizhe Li; Yiming Qin; Minghang Zheng; Xin Jin; Yang Liu; | cvpr | 2024-06-13 |
| 781 | Music Time Signature Detection Using ResNet18 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Time signature detection is a fundamental task in music information retrieval, aiding in music organization. In recent years, the demand for robust and efficient methods in music … |
Jeremiah Abimbola; Daniel Kostrzewa; P. Kasprowski; | EURASIP J. Audio Speech Music. Process. | 2024-06-13 |
| 782 | MeLFusion: Synthesizing Music from Image and Language Cues Using Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: MeLFusion is a text-to-music diffusion model with a novel "visual synapse" which effectively infuses the semantics from the visual modality into the generated music. To facilitate research in this area we introduce a new dataset MeLBench and propose a new evaluation metric IMSM. |
Sanjoy Chowdhury; Sayan Nag; K J Joseph; Balaji Vasan Srinivasan; Dinesh Manocha; | cvpr | 2024-06-13 |
| 783 | MuseChat: A Conversational Music Recommendation System for Videos IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Their inability to interact with users for further refinements or to provide explanations leads to a less satisfying experience. We address these issues with MuseChat, a first-of-its-kind dialogue-based recommendation system that personalizes music suggestions for videos. |
Zhikang Dong; Xiulong Liu; Bin Chen; Pawel Polak; Peng Zhang; | cvpr | 2024-06-13 |
| 784 | Are We There Yet? A Brief Survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we provide a comprehensive overview of the available music-emotion datasets and discuss evaluation standards as well as competitions in the field. |
Jaeyong Kang; Dorien Herremans; | arxiv-cs.SD | 2024-06-13 |
| 785 | Controllable Dance Generation with Style-Guided Motion Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce Flexible Dance Generation with Style Description Prompts (DGSDP), a diffusion-based framework suitable for diversified tasks of dance generation by fully leveraging the semantics of music style. |
HONGSONG WANG et al. | arxiv-cs.CV | 2024-06-12 |
| 786 | Emotion Manipulation Through Music – A Deep Learning Interactive Visual Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music evokes emotion in many people. We introduce a novel way to manipulate the emotional content of a song using AI tools. Our goal is to achieve the desired emotion while … |
Adel N. Abdalla; Jared Osborne; Răzvan Andonie; | 2024 28th International Conference Information … | 2024-06-12 |
| 787 | Emotion Manipulation Through Music — A Deep Learning Interactive Visual Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a novel way to manipulate the emotional content of a song using AI tools. |
Adel N. Abdalla; Jared Osborne; Razvan Andonie; | arxiv-cs.SD | 2024-06-12 |
| 788 | TokSing: Singing Voice Synthesis Based on Discrete Tokens IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce TokSing, a discrete-based SVS system equipped with a token formulator that offers flexible token blendings. |
YUNING WU et al. | arxiv-cs.SD | 2024-06-12 |
| 789 | MusicFlow: Cascaded Flow Matching for Text Guided Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce MusicFlow, a cascaded text-to-music generation model based on flow matching. |
K R PRAJWAL et al. | icml | 2024-06-12 |
| 790 | LLark: A Multimodal Instruction-Following Language Model for Music IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present LLark, an instruction-tuned multimodal model for \emph{music} understanding. |
Joshua P Gardner; Simon Durand; Daniel Stoller; Rachel M Bittner; | icml | 2024-06-12 |
| 791 | Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we explore two zero-shot editing techniques for audio signals, which use DDPM inversion with pre-trained diffusion models. |
Hila Manor; Tomer Michaeli; | icml | 2024-06-12 |
| 792 | DITTO: Diffusion Inference-Time T-Optimization for Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Diffusion Inference-Time T-Optimization (DITTO), a general-purpose framework for controlling pre-trained text-to-music diffusion models at inference-time via optimizing initial noise latents. |
Zachary Novack; Julian McAuley; Taylor Berg-Kirkpatrick; Nicholas J. Bryan; | icml | 2024-06-12 |
| 793 | Social Music Discovery: An Ethical Recommendation System Based on Friend’s Preferred Songs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music recommendation systems have become ubiquitous in today’s world, but they raise ethical concerns related to bias, discrimination, and lack of transparency. To address these … |
Marco Furini; Francesca Fragnelli; | Multim. Tools Appl. | 2024-06-11 |
| 794 | MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present the MOSA (Music mOtion with Semantic Annotation) dataset, which contains high-quality 3-D motion capture data, aligned audio recordings, and note-by-note semantic annotations of pitch, beat, phrase, dynamic, articulation, and harmony for 742 professional music performances by 23 professional musicians, comprising more than 30 hours and 570 K notes of data. |
YU-FEN HUANG et al. | arxiv-cs.SD | 2024-06-10 |
| 795 | VGGISH For Music/Speech Classification In Radio Broadcasting IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the realm of audio signal processing, distinguishing between music and speech poses a significant challenge due to the nuanced similarities and complexities inherent in both … |
Salvatore Serrano; Marco Scarpa; Omar Serghini; | European Conference on Modelling and Simulation | 2024-06-07 |
| 796 | Innovations in Cover Song Detection: A Lyrics-Based Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Cover songs are alternate versions of a song by a different artist. Having long been a vital part of the music industry, cover songs significantly influence music culture and are … |
Maximilian Balluff; Peter Mandl; Christian Wolff; | ArXiv | 2024-06-06 |
| 797 | STraDa: A Singer Traits Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The annotated-strada consists of two hundred tracks balanced in terms of 2 genders, 5 languages, and 4 age groups. To demonstrate its use for model training and bias analysis, enabled by its rich metadata and downloadable audio files, we benchmarked singer sex classification (SSC) and conducted a bias analysis. |
Yuexuan Kong; Viet-Anh Tran; Romain Hennequin; | arxiv-cs.SD | 2024-06-06 |
| 798 | Negative Feedback for Music Personalization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We show the benefits of using real negative feedback both as inputs into the user sequence and also as negative targets for training a next-song recommender system for internet radio. |
M. Jeffrey Mei; Oliver Bembom; Andreas F. Ehmann; | arxiv-cs.LG | 2024-06-06 |
| 799 | How The Emotional Content of Music Affects Player Behaviour and Experience in Video Games Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Previous research studying music’s effect on video games has focused on musical properties, such as tempo, to create particular emotional player experiences. However, music is … |
Joshua Roberts; Jason Wuertz; Max V. Birk; Scott Bateman; Daniel J. Rea; | 2024 IEEE Gaming, Entertainment, and Media Conference (GEM) | 2024-06-05 |
| 800 | Intelligent Text-Conditioned Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this project, we apply a similar approach to bridge the gap between natural language and music. |
Zhouyao Xie; Nikhil Yadala; Xinyi Chen; Jing Xi Liu; | arxiv-cs.MM | 2024-06-02 |
| 801 | Topological Querying of Music Scores Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
P. Rigaux; Virginie Thion; | Data Knowl. Eng. | 2024-06-01 |
| 802 | May The Dance Be with You: Dance Generation Framework for Non-Humanoids Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: If an agent can recognize the relationship between visual rhythm and music, it will be able to dance by generating a motion to create a visual rhythm that matches the music. Based on this, we propose a framework for any kind of non-humanoid agents to learn how to dance from human videos. |
Hyemin Ahn; | arxiv-cs.CV | 2024-05-30 |
| 803 | DanceCraft: A Music-Reactive Real-time Dance Improv System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Automatic generation of 3D dance motion, in response to live music, is a challenging task. Prior research has assumed that either the entire music track, or a significant chunk of … |
Ruilin Xu; Vu An Tran; Shree K. Nayar; Gurunandan Krishnan; | Proceedings of the 9th International Conference on Movement … | 2024-05-30 |
| 804 | DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Distilled Diffusion Inference-Time T-Optimization (or DITTO-2), a new method to speed up inference-time optimization-based control and unlock faster-than-real-time generation for a wide variety of applications such as music inpainting, outpainting, intensity, melody, and musical structure control. |
Zachary Novack; Julian McAuley; Taylor Berg-Kirkpatrick; Nicholas Bryan; | arxiv-cs.SD | 2024-05-30 |
| 805 | CoDancers: Music-Driven Coherent Group Dance Generation with Choreographic Unit Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Dance and music are intimately interconnected, with group dance being a crucial part of dance artistry. Consequently, Music-Driven Group Dance Generation has been a fundamental … |
KAIXING YANG et al. | Proceedings of the 2024 International Conference on … | 2024-05-30 |
| 806 | Socially-Motivated Music Recommendation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Extensive literature spanning psychology, sociology, and musicology has sought to understand the motivations for why people listen to music, including both individually and … |
Benjamin Lacker; Samuel F. Way; | International Conference on Web and Social Media | 2024-05-28 |
| 807 | Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models Via Instruction Tuning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Recent advances in text-to-music editing, which employ text queries to modify music (e.g., by changing its style or adjusting instrumental components), present unique challenges and opportunities for AI-assisted music creation. Previous approaches in this domain have been constrained by the necessity to train specific editing models from scratch, which is both resource-intensive and inefficient; other research uses large language models to predict edited music, resulting in imprecise audio reconstruction. To combine the strengths and address these limitations, we introduce Instruct-MusicGen, a novel approach that finetunes a pretrained MusicGen model to efficiently follow editing instructions such as adding, removing, or separating stems. |
YIXIAO ZHANG et. al. | arxiv-cs.SD | 2024-05-28 |
| 808 | Enhancing Music Genre Classification Through Multi-Algorithm Analysis and User-Friendly Visualization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The aim of this study is to teach an algorithm how to recognize different types of music. |
Navin Kamuni; Dheerendra Panwar; | arxiv-cs.SD | 2024-05-27 |
| 809 | Binding Text, Images, Graphs, and Audio for Music Representation Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the field of Information Retrieval and Natural Language Processing, text embeddings play a significant role in tasks such as classification, clustering, and topic modeling. … |
Abdulrahman Tabaza; Omar Quishawi; Abdelrahman Yaghi; Omar Qawasmeh; | Proceedings of the Cognitive Models and Artificial … | 2024-05-25 |
| 810 | QA-MDT: Quality-aware Masked Diffusion Transformer for Enhanced Music Generation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Text-to-music (TTM) generation, which converts textual descriptions into audio, opens up innovative avenues for multimedia creation. Achieving high quality and diversity in this … |
CHANG LI et. al. | International Joint Conference on Artificial Intelligence | 2024-05-24 |
| 811 | Quality-aware Masked Diffusion Transformer for Enhanced Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Most open-source datasets frequently suffer from issues like low-quality waveforms and low text-audio consistency, hindering the advancement of music generation models. To address these challenges, we propose a novel quality-aware training paradigm for generating high-quality, high-musicality music from large-scale, quality-imbalanced datasets. |
CHANG LI et. al. | arxiv-cs.SD | 2024-05-24 |
| 812 | Quality-aware Masked Diffusion Transformer for Enhanced Music Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: In recent years, diffusion-based text-to-music (TTM) generation has gained prominence, offering an innovative approach to synthesizing musical content from textual descriptions. … |
CHANG LI et. al. | ArXiv | 2024-05-24 |
| 813 | The Rarity of Musical Audio Signals Within The Space of Possible Audio Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: A white noise signal can access any possible configuration of values, though statistically over many samples tends to a uniform spectral distribution, and is highly unlikely to … |
Nick Collins; | arxiv-cs.SD | 2024-05-23 |
| 814 | SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models By Searching Up-to-Date Internet Knowledge IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a plug-and-play framework, for augmenting existing LVLMs in handling visual question answering (VQA) about up-to-date knowledge, dubbed SearchLVLMs. |
CHUANHAO LI et. al. | arxiv-cs.CV | 2024-05-23 |
| 815 | Music Genre Classification: Training An AI Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this research I explore various machine learning algorithms for the purpose of music genre classification, using features extracted from audio signals. The systems are a Multilayer Perceptron (built from scratch), a k-Nearest Neighbours classifier (also built from scratch), a Convolutional Neural Network, and a Random Forest wide model. |
Keoikantse Mogonediwa; | arxiv-cs.SD | 2024-05-23 |
| 816 | Influence of The Development of Internet Big Data on College Students’ Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Under the background of globalisation and diversification, how to construct a music culture with national characteristics and develop music education with both diversity and … |
Yan Wang; | Int. J. Inf. Syst. Supply Chain Manag. | 2024-05-22 |
| 817 | What Makes A Viral Song? Unraveling Music Virality Factors Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The viral phenomenon is present in several contexts, combining the advantages of streaming platforms and other social networks. Music is no exception. Viral songs are widely … |
Gabriel P. Oliveira; Ana Paula Couto da Silva; Mirella M. Moro; | Proceedings of the 16th ACM Web Science Conference | 2024-05-21 |
| 818 | SYMPLEX: Controllable Symbolic Music Generation Using Simplex Diffusion with Vocabulary Priors Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a new approach for fast and controllable generation of symbolic music based on the simplex diffusion, which is essentially a diffusion process operating on probabilities rather than the signal space. |
Nicolas Jonason; Luca Casini; Bob L. T. Sturm; | arxiv-cs.SD | 2024-05-21 |
| 819 | A Genre-Based Analysis of New Music Streaming at Scale Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The rise of on-demand music streaming platforms and novel recommendation algorithms have brought a transformative shift in music listening, where users have an effectively endless … |
Julie Jiang; Aditya Ponnada; Ang Li; Benjamin Lacker; Samuel F. Way; | Proceedings of the 16th ACM Web Science Conference | 2024-05-21 |
| 820 | A Dataset and Baselines for Measuring and Predicting The Music Piece Memorability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Despite the abundance of music, certain pieces remain more memorable and often gain greater popularity. Inspired by this phenomenon, we focus on measuring and predicting music memorability. |
Li-Yang Tseng; Tzu-Ling Lin; Hong-Han Shuai; Jen-Wei Huang; Wen-Whei Chang; | arxiv-cs.IR | 2024-05-21 |
| 821 | Modeling User Attention in Music Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose modeling user attention prediction as a positive-unlabeled (PU) learning problem, where active feedback is treated as positive samples and passive feedback as unlabeled samples, since we can only be sure that a user’s attention is focused when she provides active feedback. |
SUNHAO DAI et. al. | icde | 2024-05-19 |
| 822 | First-mover Advantage in Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Why do some songs and musicians become successful while others do not? We show that one of the reasons may be the “first-mover advantage”: artists that stand at the foundation of … |
Oleg Sobchuk; Mason Youngblood; Olivier Morin; | EPJ Data Science | 2024-05-17 |
| 823 | Construction and Implementation of Content-Based National Music Retrieval Model Under Deep Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This research mainly studies the construction and implementation of the content-based folk music retrieval model. Firstly, it studies the music automatic annotation method based … |
Jing Shi; Lei Liu; | Int. J. Inf. Syst. Model. Des. | 2024-05-17 |
| 824 | Whole-Song Hierarchical Generation of Symbolic Music Using Cascaded Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we make the first attempt to model a full music piece under the realization of compositional hierarchy. |
Ziyu Wang; Lejun Min; Gus Xia; | arxiv-cs.SD | 2024-05-16 |
| 825 | MVBIND: Self-Supervised Music Recommendation For Videos Via Embedding Space Binding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces MVBind, an innovative Music-Video embedding space Binding model for cross-modal retrieval. |
Jiajie Teng; Huiyu Duan; Yucheng Zhu; Sijing Wu; Guangtao Zhai; | arxiv-cs.MM | 2024-05-15 |
| 826 | SMUG-Explain: A Framework for Symbolic Music Graph Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we present Score MUsic Graph (SMUG)-Explain, a framework for generating and visualizing explanations of graph neural networks applied to arbitrary prediction tasks on musical scores. |
Emmanouil Karystinaios; Francesco Foscarin; Gerhard Widmer; | arxiv-cs.SD | 2024-05-15 |
| 827 | Dance Any Beat: Blending Beats with Visuals in Dance Video Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These methods also require precise keypoint annotations, complicating data collection and limiting the use of self-collected video datasets. To overcome these challenges, we introduce a novel task: generating dance videos directly from images of individuals guided by music. |
Xuanchen Wang; Heng Wang; Dongnan Liu; Weidong Cai; | arxiv-cs.CV | 2024-05-15 |
| 828 | Naturalistic Music Decoding from EEG Data Via Latent Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this article, we explore the potential of using latent diffusion models, a family of powerful generative models, for the task of reconstructing naturalistic music from electroencephalogram (EEG) recordings. |
EMILIAN POSTOLACHE et. al. | arxiv-cs.SD | 2024-05-14 |
| 829 | Changing Your Tune: Lessons for Using Music to Encourage Physical Activity Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Our research investigated whether music can communicate physical activity levels in daily life. Past studies have shown that simple musical tunes can provide wellness information, … |
Matthew Clark; Afsaneh Doryab; | Proceedings of the ACM on Interactive, Mobile, Wearable and … | 2024-05-13 |
| 830 | Modeling User Attention in Music Recommendation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the popularity of online music services, personalized music recommendation has garnered much research interest. Recommendation models are typically trained on datasets … |
SUNHAO DAI et. al. | 2024 IEEE 40th International Conference on Data Engineering … | 2024-05-13 |
| 831 | MARingBA: Music-Adaptive Ringtones for Blended Audio Notification Delivery Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Audio notifications provide users with an efficient way to access information beyond their current focus of attention. Current notification delivery methods, like phone ringtones, … |
Alexander Wang; Yi Fei Cheng; David Lindlbauer; | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
| 832 | Towards An Accessible and Rapidly Trainable Rhythm Sequencer Using A Generative Stacked Autoencoder Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes the integration of generative stacked autoencoder structures for rhythm generation, within a conventional melodic step-sequencer. |
Alex Wastnidge; | arxiv-cs.SD | 2024-05-11 |
| 833 | Co-designing The Collaborative Digital Musical Instruments for Group Music Therapy Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Digital Musical Instruments (DMIs) have been integrated into group music therapy, providing therapists with alternative ways to engage in musical dialogues with their clients. … |
YUAN-LING FENG et. al. | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
| 834 | Waves Push Me to Slumberland: Reducing Pre-Sleep Stress Through Spatio-Temporal Tactile Displaying of Music Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite the fact that spatio-temporal patterns of vibration, characterized as rhythmic compositions of tactile content, have exhibited an ability to elicit specific emotional … |
HUI ZHANG et. al. | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
| 835 | A Way for Deaf and Hard of Hearing People to Enjoy Music By Exploring and Customizing Cross-modal Music Concepts IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Deaf and hard of hearing (DHH) people enjoy music and access it using a music-sensory substitution system that delivers sound together with the corresponding visual and tactile … |
Youjin Choi; Junryeol Jeon; ChungHa Lee; Yeongeun Noh; Jin-Hyuk Hong; | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
| 836 | DoodleTunes: Interactive Visual Analysis of Music-Inspired Children Doodles with Automated Feature Annotation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music and visual arts are essential in children’s arts education, and their integration has garnered significant attention. Existing data analysis methods for exploring … |
SHUQI LIU et. al. | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
| 837 | Color Singer: Composing Music Via The Construction of LEGO Blocks with Various Colors Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Explore the fusion of creativity, play, and music with our project, where LEGO blocks transform abstract music composition into a tangible, intuitive experience. Each color in the … |
Qiuyu Lu; Zhang Yin; Jingtian Fu; Naixuan Du; Yingqing Xu; | Extended Abstracts of the CHI Conference on Human Factors … | 2024-05-11 |
| 838 | “Is Text-Based Music Search Enough to Satisfy Your Needs?” A New Way to Discover Music with Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music is intrinsically connected to human experience, yet the plethora of choices often renders the search for the ideal piece perplexing, especially when the search terms are … |
Jeongeun Park; Hyorim Shin; Changhoon Oh; Ha Young Kim; | Proceedings of the CHI Conference on Human Factors in … | 2024-05-11 |
| 839 | Music Corner – A Feasibility Study for Creating A Gesture-Based Rhythm Game for Music Education Inspired By Solfege Hand Signs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In Hong Kong, the gap in music education access between children from affluent and grassroots families is pronounced. To address this, we present Music Corner, a prototype of a … |
Fawad Ahmad Rana; Yuk Lam Tsang; Tak Wa Yip; | Extended Abstracts of the CHI Conference on Human Factors … | 2024-05-11 |
| 840 | Music Emotion Prediction Using Recurrent Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This study explores the application of recurrent neural networks to recognize emotions conveyed in music, aiming to enhance music recommendation systems and support therapeutic interventions by tailoring music to fit listeners’ emotional states. |
Xinyu Chang; Xiangyu Zhang; Haoruo Zhang; Yulu Ran; | arxiv-cs.SD | 2024-05-10 |
| 841 | Encoder-Decoder Framework for Interactive Free Verse Generation with Controllable High-Quality Rhyming Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel fine-tuning approach that prepends the rhyming word at the start of each lyric, which allows the critical rhyming decision to be made before the model commits to the content of the lyric (as during reverse language modeling), but maintains compatibility with the word order of regular PLMs as the lyric itself is still generated in left-to-right order. |
TOMMASO PASINI et. al. | arxiv-cs.CL | 2024-05-08 |
| 842 | Information Encoding in Computer Science Education Using The Cup Song Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Problem solving plays a central role in computer science classes, whereby the problems to be analyzed are often already available in encoded form. The initial process where the … |
Heike Buttke; Johannes Krugel; | 2024 IEEE Global Engineering Education Conference (EDUCON) | 2024-05-08 |
| 843 | Mozart’s Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, current models for image- and video-to-music synthesis struggle to capture the nuanced emotions and atmosphere conveyed by visual content. To fill this gap, we propose Mozart’s Touch, a multi-modal music generation framework capable of generating music aligned with cross-modal inputs such as images, videos, and text. |
Jiajun Li; Tianze Xu; Xuesong Chen; Xinrui Yao; Shuchang Liu; | arxiv-cs.SD | 2024-05-04 |
| 844 | DialectDecoder: Human/machine Teaming for Bird Song Classification and Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
BRITTANY STORY et. al. | Ecol. Informatics | 2024-05-01 |
| 845 | ComposerX: Multi-Agent Symbolic Music Composition with LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To further explore and enhance LLMs’ potential in music composition by leveraging their reasoning ability and the large knowledge base in music history and theory, we propose ComposerX, an agent-based symbolic music generation framework. |
QIXIN DENG et. al. | arxiv-cs.SD | 2024-04-28 |
| 846 | Introducing EEG Analyses to Help Personal Music Preference Prediction Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Nowadays, personalized recommender systems play an increasingly important role in music scenarios in our daily life with the preference prediction ability. However, existing … |
ZHIYU HE et. al. | ArXiv | 2024-04-24 |
| 847 | Music Style Transfer With Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The existing music style transfer methods generate spectrograms with artifacts, leading to significant noise in the generated audio. To address these issues, this study proposes a music style transfer framework based on diffusion models (DM) and uses spectrogram-based methods to achieve multi-to-multi music style transfer. |
Hong Huang; Yuyi Wang; Luyao Li; Jun Lin; | arxiv-cs.SD | 2024-04-23 |
| 848 | Musical Word Embedding for Music Tagging and Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, in the domain of music, the word embedding may have difficulty understanding musical contexts or recognizing music-related entities like artists and tracks. To address this issue, we propose a new approach called Musical Word Embedding (MWE), which involves learning from various types of texts, including both everyday and music-related vocabulary. |
SeungHeon Doh; Jongpil Lee; Dasaem Jeong; Juhan Nam; | arxiv-cs.SD | 2024-04-21 |
| 849 | Music Consistency Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Nevertheless, the application of consistency models in music generation remains largely unexplored. To address this gap, we present Music Consistency Models (MusicCM), which leverages the concept of consistency models to efficiently synthesize mel-spectrograms for music clips, maintaining high quality while minimizing the number of sampling steps. |
Zhengcong Fei; Mingyuan Fan; Junshi Huang; | arxiv-cs.SD | 2024-04-20 |
| 850 | Track Role Prediction of Single-Instrumental Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study introduces a deep learning model designed to automatically predict the track-role of single-instrumental music sequences. |
Changheon Han; Suhyun Lee; Minsam Ko; | arxiv-cs.SD | 2024-04-20 |
| 851 | Large Language Models: From Notes to Musical Form Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Adapting a recent music generation model, this paper proposes a novel method to generate music with form. |
Lilac Atassi; | arxiv-cs.SD | 2024-04-18 |
| 852 | MIDGET: Music Conditioned 3D Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a MusIc conditioned 3D Dance GEneraTion model, named MIDGET, based on a Dance motion Vector Quantised Variational AutoEncoder (VQ-VAE) model and a Motion Generative Pre-Training (GPT) model, to generate vibrant and high-quality dances that match the music rhythm. |
Jinwu Wang; Wei Mao; Miaomiao Liu; | arxiv-cs.SD | 2024-04-18 |
| 853 | Long-form Music Generation with Latent Diffusion IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We show that by training a generative model on long temporal contexts it is possible to produce long-form music of up to 4m45s. |
ZACH EVANS et. al. | arxiv-cs.SD | 2024-04-16 |
| 854 | Violin Music Emotion Recognition with Fusion of CNN-BiGRU and Attention Mechanism Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music emotion recognition has garnered significant interest in recent years, as the emotions expressed through music can profoundly enhance our understanding of its deeper … |
Sihan Ma; Ruohua Zhou; | Inf. | 2024-04-16 |
| 855 | Ainur: Harmonizing Speed and Quality in Deep Music Generation Through Lyrics-Audio Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In the domain of music generation, prevailing methods focus on text-to-music tasks, predominantly relying on diffusion models. |
G. Concialdi; A. Koudounas; E. Pastor; B. Di Eugenio; E. Baralis; | icassp | 2024-04-15 |
| 856 | Unified Pretraining Target Based Video-Music Retrieval with Music Rhythm and Video Optical Flow Information Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, our proposed approach leverages a unified target set to perform video/music pretraining and produces clip-level embeddings to preserve temporal information. |
T. Mao; S. Liu; Y. Zhang; D. Li; Y. Shan; | icassp | 2024-04-15 |
| 857 | MIR-MLPop: A Multilingual Pop Music Dataset with Time-Aligned Lyrics and Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce MIR-MLPop, a publicly available multilingual pop music dataset designed for automatic lyrics transcription and lyrics alignment in polyphonic music. |
J. -Y. Wang; C. -C. Wang; C. -I. Leong; J. -S. R. Jang; | icassp | 2024-04-15 |
| 858 | Humtrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces the HumTrans dataset, which is publicly available and primarily designed for humming melody transcription. |
S. Liu; X. Li; D. Li; Y. Shan; | icassp | 2024-04-15 |
| 859 | MusicLDM: Enhancing Novelty in Text-to-music Generation Using Beat-Synchronous Mixup Strategies IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Diffusion models have shown promising results in cross-modal generation tasks, including text-to-image and text-to-audio generation. |
K. CHEN et. al. | icassp | 2024-04-15 |
| 860 | Structure-Informed Positional Encoding for Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Yet, multi-scale hierarchical structure is a distinctive feature of music signals. To leverage this information, we propose a structure-informed positional encoding framework for music generation with Transformers. |
M. Agarwal; C. Wang; G. Richard; | icassp | 2024-04-15 |
| 861 | GPT-4 Driven Cinematic Music Generation Through Text Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents Herrmann-11, a multimodal framework to generate background music tailored to movie scenes, by integrating state-of-the-art vision, language, music, and speech processing models. |
M. T. Haseeb; A. Hammoudeh; G. Xia; | icassp | 2024-04-15 |
| 862 | Stereophonic Music Source Separation with Spatially-Informed Bridging Band-Split Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a spatially-informed MSS method using a bridging band-split neural network that incorporates both spatial and spectral information. |
Y. YANG et. al. | icassp | 2024-04-15 |
| 863 | Enriching Music Descriptions with A Finetuned-LLM and Metadata for Text-to-Music Retrieval IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, users also articulate a need to explore music that shares similarities with their favorite tracks or artists, such as I need a similar track to Superstition by Stevie Wonder. To address these concerns, this paper proposes an improved Text-to-Music Retrieval model, denoted as TTMR++, which utilizes rich text descriptions generated with a finetuned large language model and metadata. |
S. Doh; M. Lee; D. Jeong; J. Nam; | icassp | 2024-04-15 |
| 864 | SingFake: Singing Voice Deepfake Detection IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose the singing voice deepfake detection task. |
Y. Zang; Y. Zhang; M. Heydari; Z. Duan; | icassp | 2024-04-15 |
| 865 | Adapting Frechet Audio Distance for Generative Music Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose reducing sample size bias by extrapolating scores towards an infinite sample size. |
A. Gui; H. Gamper; S. Braun; D. Emmanouilidou; | icassp | 2024-04-15 |
| 866 | STEMGEN: A Music Generation Model That Listens IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present an alternative paradigm for producing music generation models that can listen and respond to musical context. |
J. D. Parker; | icassp | 2024-04-15 |
| 867 | VRDMG: Vocal Restoration Via Diffusion Posterior Sampling with Multiple Guidance IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we identify potential issues that degrade the performance of current DPS-based methods and introduce ways to mitigate them, inspired by diverse diffusion guidance techniques including the RePaint (RP) strategy and the Pseudoinverse-Guided Diffusion Models (ΠGDM). |
C. Hernandez-Olivan; | icassp | 2024-04-15 |
| 868 | Emotion-Aligned Contrastive Learning Between Images and Music Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we address the task of retrieving emotionally-relevant music from image queries by learning an affective alignment between images and music audio. |
S. Stewart; K. Avramidis; T. Feng; S. Narayanan; | icassp | 2024-04-15 |
| 869 | FSD: An Initial Chinese Dataset for Fake Song Detection IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Thus, we employ the FSD dataset for the training of ADD models. We subsequently evaluate these models under two scenarios: one with the original songs and another with separated vocal tracks. |
Y. Xie; | icassp | 2024-04-15 |
| 870 | Music-to-Dance Poses: Learning to Retrieve Dance Poses from Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Choreography is an artful blend of technique and creativity, requiring the meticulous design of movement sequences in harmony with music. To support choreographers in this intricate task, this work proposes a music-to-dance pose retrieval system that uses music snippets to retrieve dance poses, predicts 3D human poses and shapes, and then matches them within the 3D pose and shape space. |
B. -W. Tseng; K. Yang; Y. -H. Hu; W. -L. Wei; J. -C. Lin; | icassp | 2024-04-15 |
| 871 | FDA-MIMO Radar Using Ambiguity Function for Target Two-Dimensional Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, an ambiguity function(AF)-based multiple signal classification (MUSIC) algorithm, namely, AF-MUSIC, is presented for the FDA and multiple-input multiple-output (MIMO) combined FDA-MIMO radar to address two-dimensional localization of moving targets. |
W. -Q. Wang; | icassp | 2024-04-15 |
| 872 | Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As the main contribution of this work, we propose enhancing control of multi-instrument synthesis by conditioning a generative model on a specific performance and recording environment, thus allowing for better guidance of timbre and style. |
B. Maman; J. Zeitler; M. Müller; A. H. Bermano; | icassp | 2024-04-15 |
| 873 | Generating Stereophonic Music with Single-Stage Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The recent success of audio language models (LMs) has revolutionized the field of neural music generation. Among all audio LM approaches, MusicGen has demonstrated the success of … |
X. Li; | icassp | 2024-04-15 |
| 874 | Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Text-to-music generation (T2M-Gen) faces a major obstacle due to the scarcity of large-scale publicly available music datasets with natural language captions. To address this, we propose the Music Understanding LLaMA (MU-LLaMA), capable of answering music-related questions and generating captions for music files. |
S. Liu; A. S. Hussain; C. Sun; Y. Shan; | icassp | 2024-04-15 |
| 875 | Investigating Personalization Methods in Text to Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we investigate the personalization of text-to-music diffusion models in a few-shot setting. |
M. Plitsis; T. Kouzelis; G. Paraskevopoulos; V. Katsouros; Y. Panagakis; | icassp | 2024-04-15 |
| 876 | A Scalable Sparse Transformer Model for Singing Melody Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a simple yet effective scalable sparse transformer for singing melody extraction. |
S. Yu; J. Liu; Y. Yu; W. Li; | icassp | 2024-04-15 |
| 877 | Joint Music and Language Attention Models for Zero-Shot Music Tagging IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a zero-shot music tagging system modeled by a joint music and language attention (JMLA) model to address the open-set music tagging problem. |
X. Du; Z. Yu; J. Lin; B. Zhu; Q. Kong; | icassp | 2024-04-15 |
| 878 | Stack-and-Delay: A New Codebook Pattern for Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper we compare different decoding strategies that aim to understand what codes can be decoded in parallel without penalizing the quality too much. |
G. Le Lan; | icassp | 2024-04-15 |
| 879 | ESVC: Combining Adaptive Style Fusion and Multi-Level Feature Disentanglement for Expressive Singing Voice Conversion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an expressive SVC framework called ESVC, which can convert singer identity and emotional style simultaneously. |
Z. Yang; | icassp | 2024-04-15 |
| 880 | Synthia’s Melody: A Benchmark Framework for Unsupervised Domain Adaptation in Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We, in part, attribute this to the lack of an appropriate benchmark dataset. To address this gap, we present Synthia’s melody, a novel audio data generation framework capable of simulating an infinite variety of 4-second melodies with user-specified confounding structures characterised by musical keys, timbre, and loudness. |
C. -H. Lin; C. Jones; B. W. Schuller; H. Coppock; A. Akman; | icassp | 2024-04-15 |
| 881 | Music Auto-Tagging with Robust Music Representation Learned Via Domain Adversarial Training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study proposes a method inspired by speech-related tasks to enhance music auto-tagging performance in noisy settings. |
H. Joung; K. Lee; | icassp | 2024-04-15 |
| 882 | MIR-MLPop: A Multilingual Pop Music Dataset with Time-Aligned Lyrics and Audio Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We introduce MIR-MLPop, a publicly available multilingual pop music dataset designed for automatic lyrics transcription and lyrics alignment in polyphonic music. The dataset … |
J. Wang; Chung-Che Wang; Chon-In Leong; Jyh-Shing Roger Jang; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
| 883 | Girls Rocking The Code: Gender-dependent Stereotypes, Engagement & Comprehension in Music Programming Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: One of the greatest challenges in early programming education is to achieve learning success while also creating initial interest. This is particularly difficult for girls, who … |
Isabella Graßl; Gordon Fraser; | 2024 IEEE/ACM 46th International Conference on Software … | 2024-04-14 |
| 884 | A Scalable Sparse Transformer Model for Singing Melody Extraction Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Extracting the melody of a singing voice is an essential task within the realm of music information retrieval (MIR). Recently, transformer based models have drawn great attention … |
Shuai Yu; Jun Liu; Yi Yu; Wei Li; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
| 885 | GPT-4 Driven Cinematic Music Generation Through Text Processing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents Herrmann-11, a multimodal framework to generate background music tailored to movie scenes, by integrating state-of-the-art vision, language, music, and speech … |
Muhammad Taimoor Haseeb; Ahmad Hammoudeh; Gus G. Xia; | ICASSP 2024 – 2024 IEEE International Conference on … | 2024-04-14 |
| 886 | Context-Aware Music Recommendation Algorithm Combining Classification and Collaborative Filtering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: As an effective solution to the problem of information overload, personalized recommendations have received widespread attention in the music field. A context-aware music … |
Xiaoling Wu; Guodong Sun; | Scalable Comput. Pract. Exp. | 2024-04-12 |
| 887 | SingDistVis: Interactive Overview+Detail Visualization for F0 Trajectories of Numerous Singers Singing The Same Song Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper describes SingDistVis, an information visualization technique for fundamental frequency (F0) trajectories of large-scale singing data where numerous singers sing the … |
Takayuki Itoh; Tomoyasu Nakano; Satoru Fukayama; Masahiro Hamasaki; Masataka Goto; | Multim. Tools Appl. | 2024-04-10 |
| 888 | MuPT: A Generative Symbolic Music Pretrained Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. |
XINGWEI QU et. al. | arxiv-cs.SD | 2024-04-09 |
| 889 | Exploring Diverse Sounds: Identifying Outliers in A Music Corpus Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we explore music outliers, investigating their potential usefulness for music discovery and recommendation systems. |
Le Cai; Sam Ferguson; Gengfa Fang; Hani Alshamrani; | arxiv-cs.SD | 2024-04-09 |
| 890 | An Improved Intelligent Machine Learning Approach to Music Recommendation Based on Big Data Techniques and DSO Algorithms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: INTRODUCTION: In an effort to enhance the quality of user experience in using music services and improve the efficiency of music recommendation platforms, researching accurate and … |
Sujie He; Yuxian Li; | EAI Endorsed Trans. Scalable Inf. Syst. | 2024-04-08 |
| 891 | The NES Video-Music Database: A Dataset of Symbolic Video Game Music Paired with Gameplay Videos Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Neural models are one of the most popular approaches for music generation, yet there aren’t standard large datasets tailored for learning music directly from game data. To address this research gap, we introduce a novel dataset named NES-VMDB, containing 98,940 gameplay videos from 389 NES games, each paired with its original soundtrack in symbolic format (MIDI). |
Igor Cardoso; Rubens O. Moraes; Lucas N. Ferreira; | arxiv-cs.SD | 2024-04-05 |
| 892 | A Computational Analysis of Lyric Similarity Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, many of these systems do not fully consider human perceptions of lyric similarity, primarily due to limited research in this area. To bridge this gap, we conducted a comparative analysis of computational methods for modeling lyric similarity with human perception. |
Haven Kim; Taketo Akama; | arxiv-cs.CL | 2024-04-02 |
| 893 | Practical End-to-End Optical Music Recognition for Pianoform Music IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: (b) We create a dev and test set for benchmarking typeset OMR with MusicXML ground truth based on the OpenScore Lieder corpus. |
Jiří Mayer; Milan Straka; Jan Hajič jr.; Pavel Pecina; | arxiv-cs.CV | 2024-03-20 |
| 894 | Adapting Technology for Dementia Care: The Case of Emobook App in Reminiscence Focused Music Therapy Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: INTRODUCTION: Life Story books are frequently employed to facilitate reminiscence interventions, but their use in Music Therapy remains limited in the scientific literature. There … |
Noelia Gerbaudo-González; Alejandro Catalá; Nelly Condori-Fernández; Manuel Gandoy-Crego; | EAI Endorsed Trans. Creative Technol. | 2024-03-19 |
| 895 | Algorithmic Collective Action in Recommender Systems: Promoting Songs By Reordering Playlists Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce two easily implementable strategies to select the position at which to insert the song with the goal of boosting recommendations at test time. |
Joachim Baumann; Celestine Mendler-Dünner; | arxiv-cs.IR | 2024-03-19 |
| 896 | Using Multimodal Learning Analytics to Examine Learners’ Responses to Different Types of Background Music During Reading Comprehension Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Previous studies have evaluated the affordances and challenges of performing cognitively demanding learning tasks with background music (BGM), yet the effects of various types of … |
Ying Que; J. T. D. Ng; Xiao Hu; Mitchell Kam Fai Mak; Peony Tsz Yan Yip; | Proceedings of the 14th Learning Analytics and Knowledge … | 2024-03-18 |
| 897 | Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose Prompt-Singer, the first SVS method that enables attribute controlling on singer gender, vocal range and volume with natural language. |
YONGQI WANG et. al. | arxiv-cs.SD | 2024-03-18 |
| 898 | QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a Quaternion-Enhanced Attention Network (QEAN) for visual dance synthesis from a quaternion perspective, which consists of a Spin Position Embedding (SPE) module and a Quaternion Rotary Attention (QRA) module. |
ZHIZHEN ZHOU et. al. | arxiv-cs.GR | 2024-03-18 |
| 899 | Developing Computational Thinking in Middle School Music Technology Classrooms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: To engage diverse populations of students who may not self-select into computing courses, a curriculum for a middle school music technology + computer science course that … |
L. MCCALL et. al. | Proceedings of the 55th ACM Technical Symposium on Computer … | 2024-03-14 |
| 900 | Automatic Identification of Preferred Music Genres: An Exploratory Machine Learning Approach to Support Personalized Music Therapy Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
INGRID BRUNO NUNES et. al. | Multim. Tools Appl. | 2024-03-14 |
| 901 | Influencing Factors and Modeling Methods of Vocal Music Teaching Quality Supported By Artificial Intelligence Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In order to explore the maturity of online concerts and the digital content of music resources, this article analyzes the role of artificial intelligence in music education, … |
Yang Yuan; | Int. J. Web Based Learn. Teach. Technol. | 2024-03-13 |
| 902 | Application-Oriented Talents Training for Music Majors in Colleges and Universities Based on Internet Remote Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper mainly studies the cultivation of applied talents of music majors in colleges and universities based on internet remote technology. By analyzing the definition and … |
Lin Shui; Yuan Feng; Mengting Zhong; Yuanhui Qin; | Int. J. Web Based Learn. Teach. Technol. | 2024-03-12 |
| 903 | Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Automatic lyric transcription (ALT) refers to transcribing singing voices into lyrics, while automatic music transcription (AMT) refers to transcribing singing voices into note … |
XIANGMING GU et. al. | ACM Transactions on Multimedia Computing, Communications … | 2024-03-12 |
| 904 | SiTunes: A Situational Music Recommendation Dataset with Physiological and Psychological Signals Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With an increasing number of music tracks available online, music recommender systems have become popular and ubiquitous. Previous research indicates that people’s preferences, … |
VADIM GRIGOREV et. al. | Proceedings of the 2024 Conference on Human Information … | 2024-03-10 |
| 905 | Social Media Analytics for Academic Music Library: A Case Study of CUHK Center for Chinese Music Studies Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Purpose: With the rapid development of social media, many organizations have begun to attach importance to social media platforms. This research studies the management and the use … |
BING XUE et. al. | Libr. Hi Tech | 2024-03-08 |
| 906 | Edge-cloud Computing Oriented Large-scale Online Music Education Mechanism Driven By Neural Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the advent of the big data era, edge cloud computing has developed rapidly. In this era of popular digital music, various technologies have brought great convenience to … |
Wen Xing; Adam Slowik; J. D. Peter; | J. Cloud Comput. | 2024-03-07 |
| 907 | Deconfounded Cross-modal Matching for Content-based Micro-video Background Music Recommendation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Object-oriented micro-video background music recommendation is a complicated task where the matching degree between videos and background music is a major issue. However, music … |
Jing Yi; Zhenzhong Chen; | ACM Transactions on Intelligent Systems and Technology | 2024-03-06 |
| 908 | Can Audio Reveal Music Performance Difficulty? Insights from The Piano Syllabus Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Automatically estimating the performance difficulty of a music piece represents a key process in music education to create tailored curricula according to the individual needs of … |
Pedro Ramoneda; Minhee Lee; Dasaem Jeong; J. J. Valero-Mas; Xavier Serra; | arxiv-cs.SD | 2024-03-06 |
| 909 | Interactive Melody Generation System for Enhancing The Creativity of Musicians Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study proposes a system designed to emulate the process of collaborative composition among humans, using automatic music composition technology. |
So Hirawata; Noriko Otani; | arxiv-cs.SD | 2024-03-05 |
| 910 | A Deep Learning Model of Dance Generation for Young Children Based on Music Rhythm and Beat Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the development of technology, research related to music-based dance generation models has been increasing. Some of the studies have applied algorithms to dance generation … |
Shanshan Kong; | Concurrency and Computation: Practice and Experience | 2024-03-04 |
| 911 | Optimized Multiscale Deep Bidirectional Gated Recurrent Neural Network Fostered Practical Teaching of University Music Course Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music education has a rich historical background. Nevertheless, the introduction of modern teaching methods is relatively delayed. In recent years, there has been a remarkable … |
Yuanyuan Hu; | J. Intell. Fuzzy Syst. | 2024-02-28 |
| 912 | Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: A Survey IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Such models, possibly originally developed for text and adapted for symbolic music, are trained on various tasks. We describe these models, in particular deep learning models, through different prisms, highlighting music-specialized mechanisms. |
Dinh-Viet-Toan Le; Louis Bigo; Mikaela Keller; Dorien Herremans; | arxiv-cs.IR | 2024-02-27 |
| 913 | Classical Music Education in China: The Effectiveness of The WeChat Social Media Platform and Its Impact on The Communicative and Cognitive Skills of Music Students Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Tao Chen; | Educ. Inf. Technol. | 2024-02-27 |
| 914 | Singer Identification Model Using Data Augmentation and Enhanced Feature Conversion with Hybrid Feature Vector and Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Serhat Hizlisoy; R. Arslan; E. Çolakoğlu; | EURASIP J. Audio Speech Music. Process. | 2024-02-26 |
| 915 | ChatMusician: Understanding and Generating Music Intrinsically with LLM IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce ChatMusician, an open-source LLM that integrates intrinsic musical abilities. |
RUIBIN YUAN et. al. | arxiv-cs.SD | 2024-02-25 |
| 916 | ByteComposer: A Human-like Melody Composition Method Based on Language Model Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, how to design a human-aligned and interpretable melody composition system is still under-explored. To solve this problem, we propose ByteComposer, an agent framework emulating a human’s creative pipeline in four separate steps: Conception Analysis – Draft Composition – Self-Evaluation and Modification – Aesthetic Selection. |
XIA LIANG et. al. | arxiv-cs.SD | 2024-02-23 |
| 917 | A Survey of Music Generation in The Context of Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This article presents a thorough review of music representation, feature analysis, heuristic algorithms, statistical and parametric modelling, and human and automatic evaluation measures, along with a discussion of which approaches and models seem most suitable for live interaction. |
ISMAEL AGCHAR et. al. | arxiv-cs.SD | 2024-02-23 |
| 918 | Understanding Human-AI Collaboration in Music Therapy Through Co-Design with Therapists IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We presented the co-design outcomes involving the integration of musical AIs into a music therapy process, which was developed from a theoretical framework rooted in emotion-focused therapy. |
Jingjing Sun; Jingyi Yang; Guyue Zhou; Yucheng Jin; Jiangtao Gong; | arxiv-cs.HC | 2024-02-22 |
| 919 | Below 58 BPM, Involving Real-time Monitoring and Self-medication Practices in Music Performance Through IoT Technology Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The project presented in this paper illustrates the design process for the development of an IoT system that monitors a specific bio-metric parameter (heart rate) in real time and … |
Nicolò Merendino; Antonio Rodà; Raul Masu; | Frontiers Comput. Sci. | 2024-02-21 |
| 920 | Soundtrack Success: Unveiling Song Popularity Patterns Using Machine Learning Implementation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Shruti Arora; Rinkle Rani; | SN Comput. Sci. | 2024-02-21 |
| 921 | Music Style Transfer with Time-Varying Inversion of Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents a music style transfer approach that effectively captures musical attributes using minimal data. |
SIFEI LI et. al. | arxiv-cs.SD | 2024-02-21 |
| 922 | MCSSME: Multi-Task Contrastive Learning for Semi-supervised Singing Melody Extraction from Polyphonic Music Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, to deal with the data scarcity limitation, we propose a self-consistency regularization (SCR) method to train the model on the unlabeled data. |
Shuai Yu; | aaai | 2024-02-20 |
| 923 | N-gram Unsupervised Compoundation and Feature Injection for Better Symbolic Music Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a novel method, NG-Midiformer, for understanding symbolic music sequences that leverages the N-gram approach. |
Jinhao Tian; Zuchao Li; Jiajia Li; Ping Wang; | aaai | 2024-02-20 |
| 924 | MusER: Musical Element-Based Regularization for Generating Symbolic Music with Emotion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, prior research on deep learning-based emotional music generation has rarely explored the contribution of different musical elements to emotions, let alone the deliberate manipulation of these elements to alter the emotion of music, which is not conducive to fine-grained element-level control over emotions. To address this gap, we present a novel approach employing musical element-based regularization in the latent space to disentangle distinct elements, investigate their roles in distinguishing emotions, and further manipulate elements to alter musical emotions. |
Shulei Ji; Xinyu Yang; | aaai | 2024-02-20 |
| 925 | Does AI-assisted Creation of Polyphonic Music Increase Academic Motivation? The DeepBach Graphical Model and Its Use in Music Education IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the modern music industry, AI music generators have gained particular importance. The use of AI greatly simplifies the creation of polyphony. In addition, it can increase … |
Na Yuan; | J. Comput. Assist. Learn. | 2024-02-20 |
| 926 | Structure-informed Positional Encoding for Music Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Yet, multi-scale hierarchical structure is a distinctive feature of music signals. To leverage this information, we propose a structure-informed positional encoding framework for music generation with Transformers. |
Manvi Agarwal; Changhong Wang; Gaël Richard; | arxiv-cs.SD | 2024-02-20 |
| 927 | Responding to The Call: Exploring Automatic Music Composition Using A Knowledge-Enhanced Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we train the composition module using the call-response pairs, supplementing it with musical knowledge in terms of rhythm, melody, and harmony. |
ZHEJING HU et. al. | aaai | 2024-02-20 |
| 928 | V2Meow: Meowing to The Visual Beat Via Video-to-Music Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose V2Meow, a video-to-music generation system capable of producing high-quality music audio for a diverse range of video input types using a multi-stage autoregressive model. |
KUN SU et. al. | aaai | 2024-02-20 |
| 929 | DeepSRGM — Sequence Classification and Ranking in Indian Classical Music with Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a deep learning based approach to Raga recognition. |
Sathwik Tejaswi Madhusudhan; Girish Chowdhary; | arxiv-cs.SD | 2024-02-15 |
| 930 | Phantom in The Opera: Adversarial Music Attack for Robot Dialogue System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study explores the vulnerability of robot dialogue systems’ automatic speech recognition (ASR) module to adversarial music attacks. Specifically, we explore music as a … |
Sheng Li; Jiyi Li; Yang Cao; | Frontiers Comput. Sci. | 2024-02-15 |
| 931 | DeepSRGM – Sequence Classification and Ranking in Indian Classical Music Via Deep Learning IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: A vital aspect of Indian Classical Music (ICM) is Raga, which serves as a melodic framework for compositions and improvisations alike. Raga Recognition is an important music … |
S. Madhusudhan; Girish V. Chowdhary; | ArXiv | 2024-02-15 |
| 932 | Application of Recommendation Algorithms Based on Social Relationships and Behavioral Characteristics in Music Online Teaching Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This research designed an improved collaborative filtering algorithm to handle music recommendation tasks in an online music teaching platform. This algorithm … |
Chunjing Yin; | Int. J. Web Based Learn. Teach. Technol. | 2024-02-14 |
| 933 | An Order-Complexity Aesthetic Assessment Model for Aesthetic-aware Music Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Computational aesthetic evaluation has made remarkable contributions to visual art works, but its application to music is still rare. |
Xin Jin; Wu Zhou; Jingyu Wang; Duo Xu; Yongsen Zheng; | arxiv-cs.CV | 2024-02-13 |
| 934 | Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents the Sheet Music Transformer, the first end-to-end OMR model designed to transcribe complex musical scores without relying solely on monophonic strategies. |
Antonio Ríos-Vila; Jorge Calvo-Zaragoza; Thierry Paquet; | arxiv-cs.CV | 2024-02-12 |
| 935 | RaveNET: Connecting People and Exploring Liminal Space Through Wearable Networks in Music Performance Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: RaveNET connects people to music, enabling musicians to modulate sound using signals produced by their own bodies or the bodies of others. We present three wearable prototype … |
Rachel Freire; Valentin Martinez-Missir; Courtney N. Reed; P. Strohmeier; | Proceedings of the Eighteenth International Conference on … | 2024-02-11 |
| 936 | Evaluating Co-Creativity Using Total Information Flow Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a method to compute the information flow using pre-trained generative models as entropy estimators. |
Vignesh Gokul; Chris Francis; Shlomo Dubnov; | arxiv-cs.SD | 2024-02-09 |
| 937 | MusicMagus: Zero-Shot Text-to-Music Editing Via Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, music generation usually involves iterative refinements, and how to edit the generated music remains a significant challenge. This paper introduces a novel approach to the editing of music generated by such models, enabling the modification of specific attributes, such as genre, mood and instrument, while maintaining other aspects unchanged. |
YIXIAO ZHANG et. al. | arxiv-cs.SD | 2024-02-08 |
| 938 | Hierarchical Multi-head Attention LSTM for Polyphonic Symbolic Melody Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Ahmet Kaşif; Selçuk Sevgen; Alper Ozcan; C. Catal; | Multim. Tools Appl. | 2024-02-08 |
| 939 | Towards Feature-based Versioning for Musicological Research Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper discusses the management of revisions and variants of musical works for the context of musicological research. Domain-specific languages (DSLs) are a fundamental tool … |
P. Grünbacher; Markus Neuwirth; | Proceedings of the 18th International Working Conference on … | 2024-02-07 |
| 940 | Situating Language and Music Research in A Domain-specific Versus Domain-general Framework: A Review of Theoretical and Empirical Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: While many theoretical proposals about the relationship between language and music processing have been proposed over the past 40 years, recent empirical advances have shed new … |
Katerina Drakoulaki; Christina Anagnostopoulou; M. Guasti; Barbara Tillmann; S. Varlokosta; | Lang. Linguistics Compass | 2024-02-02 |
| 941 | Melody Generation Based on Deep Ensemble Learning Using Varying Temporal Context Length Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Baibhav Nag; Asif Iqbal Middya; Sarbani Roy; | Multim. Tools Appl. | 2024-02-02 |
| 942 | Music Style Classification By Jointly Using CNN and Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Music influences people in many ways and plays an important role in human life from emotional expression to social interaction to cognitive development. However, the variety of … |
Rui Tang; Miao Qi; Nanqing Wang; | Proceedings of the 2024 16th International Conference on … | 2024-02-02 |
| 943 | Everyday Uses of Music Listening and Music Technologies By Caregivers and People with Dementia: Survey and Focus Group Study Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To ensure music technologies are appropriately designed for supporting caregivers and people living with dementia, there remains a need to better understand how music is currently used in everyday care at home. We aimed to understand how people with dementia and their caregivers use music technologies in everyday caring, as well as challenges they experience using music and technology. |
DIANNA VIDAS et. al. | arxiv-cs.HC | 2024-02-01 |
| 944 | Application of Big Data Technology in College Music Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With information technology driving the upgrading of various industries at the core of Industry 4.0, internet big data has gradually become a social hot spot … |
Jiang Bian; Tao Yang; | Int. J. Web Based Learn. Teach. Technol. | 2024-01-31 |
| 945 | SongBsAb: A Dual Prevention Approach Against Singing Voice Conversion Based Illegal Song Covers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose SongBsAb, the first proactive approach to tackle SVC-based illegal song covers. |
GUANGKE CHEN et al.; | arxiv-cs.SD | 2024-01-30 |
| 946 | Music Auto-Tagging with Robust Music Representation Learned Via Domain Adversarial Training Highlight: This study proposes a method inspired by speech-related tasks to enhance music auto-tagging performance in noisy settings. |
Haesun Joung; Kyogu Lee; | arxiv-cs.SD | 2024-01-27 |
| 947 | MoodLoopGP: Generating Emotion-Conditioned Loop Tablature Music with Multi-Granular Features Highlight: Hence, building upon LooperGP, a loopable tablature generation model, this paper explores endowing systems with control over conveyed emotions. To enable such conditional generation, we propose integrating musical knowledge by utilizing multi-granular semantic and musical features during model training and inference. |
Wenqian Cui; Pedro Sarmento; Mathieu Barthet; | arxiv-cs.SD | 2024-01-23 |
| 948 | MELODY: Robust Semi-Supervised Hybrid Model for Entity-Level Online Anomaly Detection with Multivariate Time Series Highlight: In this paper, we study the problem of anomaly detection for deployments. |
Jingchao Ni; Gauthier Guinet; Peihong Jiang; Laurent Callot; Andrey Kan; | arxiv-cs.LG | 2024-01-18 |
| 949 | Exploring The Diversity of Music Experiences for Deaf and Hard of Hearing People Highlight: Sensory substitution or enhancement techniques have been proposed to enable deaf or hard of hearing (DHH) people to listen to and even compose music. |
Kyrie Zhixuan Zhou; Weirui Peng; Yuhan Liu; Rachel F. Adler; | arxiv-cs.HC | 2024-01-17 |
| 950 | Emotional Behavior Analysis of Music Course Evaluation Based on Online Comment Mining Abstract: This study investigates the method of analyzing emotional tendencies in music courses and its application in lesson plan evaluation. Using a weighted method to analyze emotional … |
Nan Li; | Int. J. Inf. Technol. Web Eng. | 2024-01-17 |
| 951 | Link Me Baby One More Time: Social Music Discovery on Spotify Highlight: We explore the social and contextual factors that influence the outcome of person-to-person music recommendations and discovery. |
Shazia’Ayn Babul; Desislava Hristova; Antonio Lima; Renaud Lambiotte; Mariano Beguerisse-Díaz; | arxiv-cs.SI | 2024-01-16 |
| 952 | ScripTONES: Sentiment-Conditioned Music Generation for Movie Scripts Highlight: In this paper, we propose a two-stage pipeline for generating music from a movie script. |
Vishruth Veerendranath; Vibha Masti; Utkarsh Gupta; Hrishit Chaudhuri; Gowri Srinivasa; | arxiv-cs.MM | 2024-01-13 |
| 953 | Singer Identity Representation Learning Using Self-Supervised Techniques IF:3 Highlight: However, the same level of progress has not been achieved for singing voices. To bridge this gap, we suggest a framework for training singer identity encoders to extract representations suitable for various singing-related tasks, such as singing voice similarity and synthesis. |
Bernardo Torres; Stefan Lattner; Gaël Richard; | arxiv-cs.SD | 2024-01-10 |
| 954 | On Using Artificial Intelligence to Predict Music Playlist Success Abstract: The emergence of digital music platforms has fundamentally transformed the way we engage with and organize music. As playlist creation has gained widespread popularity, there is … |
Roberto Cavicchioli; J. Hu; Marco Furini; | 2024 IEEE 21st Consumer Communications & Networking … | 2024-01-06 |
| 955 | MusicAOG: An Energy-Based Model for Learning and Sampling A Hierarchical Representation of Symbolic Music Highlight: In addressing the challenge of interpretability and generalizability of artificial music intelligence, this paper introduces a novel symbolic representation that amalgamates both explicit and implicit musical information across diverse traditions and granularities. |
YIKAI QIAN et al.; | arxiv-cs.SD | 2024-01-05 |
| 956 | Integrating Repeat Listening Patterns for Enhanced Music Recommendation Abstract: Music is a form of content that is often repetitively consumed. Determining the frequency at which to recommend previously heard tracks to users is a critical issue. In recent … |
Ryunosuke Shigetomi; Hiroko Nishida; Ken-ichi Sawai; Taketoshi Ushiama; | 2024 18th International Conference on Ubiquitous … | 2024-01-03 |
| 957 | Assessing The Effects of Music on Anxiety and Performance in Tasks Using Heart Rate Variability and Mean Arterial Pressure Abstract: Anxiety disorder affects 264 million people worldwide. According to the Anxiety and Depression Association of America, 31% of the population experiences anxiety problems at some … |
M. NITHYA MYLAKUMAR et al.; | IEEE Access | 2024-01-01 |
| 958 | Expressing and Developing Melodic Phrases in Gamelan Skeletal Melody Generation Using Genetic Algorithm Abstract: A novel approach was proposed to develop a composition generation system for creating note sequences representing question-and-answer phrases (QAPs) in melodic phrases. Using a … |
A. Z. Fanani; Arry Maulana Syarif; Guruh Fajar Shidik; Aris Marjuni; | IEEE Access | 2024-01-01 |
| 959 | Identifying Irish Traditional Music Genres Using Latent Audio Representations Abstract: Irish traditional music contains a variety of features that help to understand the cultural idiosyncrasies of Irish people. One of the most prominent aspects is genre, which … |
Diego M. Jiménez-Bravo; Álvaro Lozano Murciego; Juan José Navarro-Cáceres; María Navarro-Cáceres; Treasa Harkin; | IEEE Access | 2024-01-01 |
| 960 | Full-Page Music Symbols Recognition: State-of-the-Art Deep Model Comparison for Handwritten and Printed Music Scores |
Ali Yesilkanat; Yann Soullard; Bertrand Coüasnon; Nathalie Girard; | International Workshop on Document Analysis Systems | 2024-01-01 |
| 961 | Improved Harmonic Spectral Envelope Extraction for Singer Classification with Hybridised Model |
Balachandra Kumaraswamy; | Int. J. Bio Inspired Comput. | 2024-01-01 |
| 962 | Time-Delay Estimation Based on An Enhanced Modified MUSIC With Co-Prime Frequency Sampling for Rough Pavement Abstract: In cases involving a rough interface, the echo frequency behavior of ultrawideband ground-penetrating radar (UWB-GPR) approximates a nonlinear Gaussian function. This … |
BIYUN MA et al.; | IEEE Geoscience and Remote Sensing Letters | 2024-01-01 |
| 963 | Controllable Syllable-Level Lyrics Generation From Melody With Prior Attention Abstract: Melody-to-lyrics generation, which is based on syllable-level generation, is an intriguing and challenging topic in the interdisciplinary field of music, multimedia, and machine … |
Zhe Zhang; Yi Yu; Atsuhiro Takasu; | IEEE Transactions on Multimedia | 2024-01-01 |
| 964 | HAISP: A Dataset of Human-AI Songwriting Processes From The AI Song Contest |
LIDIA MORRIS et al.; | International Society for Music Information Retrieval … | 2024-01-01 |
| 965 | Copyright and The Production of Hip Hop Music Abstract: Whereas the role of patents in cumulative innovation has been well established, little work has examined the impact that copyright policy may have on cumulative innovation in … |
J. Watson; | SSRN Electronic Journal | 2024-01-01 |
| 966 | CQTXNet: A Modified Xception Network with Attention Modules for Cover Song Identification |
Jinsoo Seo; Junghyun Kim; Hyemi Kim; | IEICE Trans. Inf. Syst. | 2024-01-01 |
| 967 | LSTM-MorA: Melody-Accompaniment Classification of MIDI Tracks |
Hui Liu; Leon Flaack; Shiyao Zhang; Tanja Schultz; | International Conference on Artificial Neural Networks | 2024-01-01 |
| 968 | METEOR: Melody-aware Texture-controllable Symbolic Orchestral Music Generation |
Dinh-Viet-Toan Le; Yi-Hsuan Yang; | ArXiv | 2024-01-01 |
| 969 | MGU-V: A Deep Learning Approach for Lo-Fi Music Generation Using Variational Autoencoders With State-of-the-Art Performance on Combined MIDI Datasets Abstract: Music generation presents a significant challenge within the realm of generative AI, encompassing diverse applications in music production, real-time composition, and other … |
Amit Kumar Bairwa; Siddhanth Bhat; Tanishk Sawant; R. Manoj; | IEEE Access | 2024-01-01 |
| 970 | Beyond The Trends: Evolution and Future Directions in Music Recommender Systems Research IF:3 Abstract: The study of Music Recommender Systems (MRS) has become crucial in digital music consumption, influencing how people discover and interact with music. This comprehensive analysis … |
Babak Amiri; Nikan Shahverdi; Amirali Haddadi; Yalda Ghahremani; | IEEE Access | 2024-01-01 |
| 971 | MusicECAN: An Automatic Denoising Network for Music Recordings With Efficient Channel Attention Abstract: In this work, we address the long-standing problem of automatic recorded music denoising. In previous audio denoising research, the primary focus has been on speech, and music … |
Haonan Cheng; Shulin Liu; Zhicheng Lian; Long Ye; Qin Zhang; | IEEE/ACM Transactions on Audio, Speech, and Language … | 2024-01-01 |
| 972 | Multi-Layer Combined Frequency and Periodicity Representations for Multi-Pitch Estimation of Multi-Instrument Music Abstract: Multi-pitch estimation (MPE) is one of the most important tasks in automatic music transcription (AMT). Since music generally involves a wide variety of instruments, MPE should be … |
Tomoki Matsunaga; Hiroaki Saito; | IEEE/ACM Transactions on Audio, Speech, and Language … | 2024-01-01 |
| 973 | Moûsai: Efficient Text-to-Music Diffusion Models IF:3 Abstract: Recent years have seen the rapid development of large generative models for text; however, much less research has explored the connection between text and another … |
Flavio Schneider; Ojasv Kamal; Zhijing Jin; Bernhard Schölkopf; | Annual Meeting of the Association for Computational … | 2024-01-01 |
| 974 | Cross-Modal Interaction Via Reinforcement Feedback for Audio-Lyrics Retrieval Abstract: The task of retrieving audio content relevant to lyric queries and vice versa plays a critical role in music-oriented applications. In this process, robust feature representations … |
Dong Zhou; Fang Lei; Lin Li; Yongmei Zhou; Aimin Yang; | IEEE/ACM Transactions on Audio, Speech, and Language … | 2024-01-01 |
| 975 | DanceComposer: Dance-to-Music Generation Using A Progressive Conditional Music Generator Abstract: A wonderful piece of music is the essence and soul of dance, which motivates the study of automatic music generation for dance. To create appropriate music from dance, cross-modal … |
Xiao Liang; Wensheng Li; Lifeng Huang; Chengying Gao; | IEEE Transactions on Multimedia | 2024-01-01 |
| 976 | Music-Driven Synchronous Dance Generation Considering K-Pop Musical and Choreographical Characteristics Abstract: Generating dance movements from music has been considered a highly challenging task, as it requires the model to comprehend concepts from two different modalities: audio and … |
Seohyun Kim; Kyogu Lee; | IEEE Access | 2024-01-01 |
| 977 | The Implementation of A Proposed Deep-learning Algorithm to Classify Music Genres Abstract: To improve the classification effect of music genres in the digital music era, the article employs deep-learning algorithms to improve the performance of the classification of … |
Lili Liu; | Open Comput. Sci. | 2024-01-01 |
| 978 | The Analysis of Multi-Track Music Generation With Deep Learning Models in Music Production Process Abstract: This study aims to explore the application of deep learning models in multi-track music generation to enhance the efficiency and quality of music production. Considering the … |
Rong Jiang; Xiaofei Mou; | IEEE Access | 2024-01-01 |
| 979 | Continuous Emotion-Based Image-to-Music Generation IF:3 Abstract: Image-to-music generation aims to generate realistic pure music according to a given image. Although many previous works are conducted on bridging image and music, they mainly … |
Yajie Wang; Mulin Chen; Xuelong Li; | IEEE Transactions on Multimedia | 2024-01-01 |
| 980 | A Computationally Light MUSIC Based Algorithm for Automotive RADARs Abstract: In this paper, a computationally light single-snapshot multiple signal classification (MUSIC) algorithm is presented for multidimensional estimation in the framework of automotive … |
M. A. Maisto; A. Dell’Aversano; Adriana Brancaccio; Ivan Russo; Raffaele Solimene; | IEEE Transactions on Computational Imaging | 2024-01-01 |
| 981 | Enhancing Running Exercise With IoT, Blockchain, and Heart Rate Adaptive Running Music Abstract: This research proposes a music-adaptive IoT exercise system integrating edge computing and blockchain technology to enhance middle-distance running exercise. The system … |
Yi Chen; Chung-Chiang Chen; Li-Chuan Tang; W. Chieng; | IEEE Access | 2024-01-01 |
| 982 | Machine Learning and Data Analysis for Word Segmentation of Classical Chinese Poems: Illustrations with Tang and Song Examples |
Chao-Lin Liu; Wei-Ting Chang; Chang-Ting Chu; Tianfu Zheng; | Digit. Scholarsh. Humanit. | 2024-01-01 |
| 983 | SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation IF:3 |
SHUANGRUI DING et al.; | ArXiv | 2024-01-01 |
| 984 | CPTGZ: Generating Chinese Guzheng Music From Chinese Paintings Based on Diffusion Model Abstract: In the context of rapid advancements in artificial intelligence technology, AI-powered music composition has demonstrated remarkable creative capabilities. However, no existing … |
Enji Zhao; Jiaxiang Zheng; Moxi Cao; | IEEE Access | 2024-01-01 |
| 985 | Music Genre Classification Based on Functional Data Analysis Abstract: Music genre classification (MGC) has gained significant attention due to its broad applications in music information retrieval. Traditional MGC approaches often rely on … |
Jiahong Shen; Guangrun Xiao; | IEEE Access | 2024-01-01 |
| 986 | Enhancing Vocal Melody Extraction With Multilevel Contexts Abstract: This paper proposes a convolutional recurrent neural network model for vocal melody extraction. It is deeper and wider than existing models in terms of the following aspects. … |
Xian Wang; | IEEE Signal Processing Letters | 2024-01-01 |
| 987 | The Usage of Artificial Intelligence Technology in Music Education System Under Deep Learning Abstract: This work addresses the needs of the music generation field by developing a music generation system based on an advanced Transformer model. The system incorporates an adaptive … |
Yinchi Chen; Yan Sun; | IEEE Access | 2024-01-01 |
| 988 | Toward An Inclusive Framework for Remote Musical Education and Practices Abstract: This paper provides an overview on inclusiveness in remote music education and networked music performances, highlighting issues and difficulties encountered by users with visual, … |
Cristina Rottondi; Matteo Sacchetto; Leonardo Severi; Andrea Bianco; | IEEE Access | 2024-01-01 |
| 989 | Music Emotion Recognition Based on Deep Learning: A Review Abstract: In recent years, with the development of the digital era, music emotion recognition technology has been widely used in the fields of music recommendation system, music … |
Xingguo Jiang; Yuchao Zhang; Guojun Lin; Ling Yu; | IEEE Access | 2024-01-01 |
| 990 | The Beauty of Repetition: An Algorithmic Composition Model With Motif-Level Repetition Generator and Outline-to-Music Generator in Symbolic Music Generation IF:3 Abstract: Most musical compositions utilize repetition as a fundamental element to create captivating aesthetic experiences. However, the potential of repetition in machine-learning-based … |
ZHEJING HU et al.; | IEEE Transactions on Multimedia | 2024-01-01 |
| 991 | Musical Genre Classification Using Advanced Audio Analysis and Deep Learning Techniques Abstract: Classifying music genres has been a significant problem in the decade of seamless music streaming platforms and countless content creations. An accurate music genre classification … |
MUMTAHINA AHMED et al.; | IEEE Open Journal of the Computer Society | 2024-01-01 |
| 992 | Predicting Song Popularity Through Machine Learning and Sentiment Analysis on Social Networks |
G. Rompolas; Athanasios Smpoukis; Eleanna Kafeza; Christos Makris; | Artificial Intelligence Applications and Innovations | 2024-01-01 |
| 993 | Research on Transformer Partial Discharge Fault Location Based on Improved UCA-RB-MUSIC Algorithm Abstract: Aiming at the problem that the coherent signal cannot be estimated when the traditional UCA-RB-MUSIC algorithm is used to detect the partial discharge of the transformer, an … |
Yanling Lv; Kexian Ai; Feng Guo; | IEEE Access | 2024-01-01 |
| 994 | Drawlody: Sketch-Based Melody Creation With Enhanced Usability and Interpretability Abstract: Sketch-based melody creation systems enable people to compose melodies by converting human-sketched melody contours into coherent melodies that fit the depicted contours. This … |
Qihao Liang; Ye Wang; | IEEE Transactions on Multimedia | 2024-01-01 |
| 995 | A Short Survey and Comparison of CNN-Based Music Genre Classification Using Multiple Spectral Features IF:3 Abstract: The goal of music genre classification is to identify the genre of given feature vectors representing certain characteristics of music clips. In addition, to improve the accuracy … |
W. Seo; Sung-Hyun Cho; Paweł Teisseyre; Jaesung Lee; | IEEE Access | 2024-01-01 |
| 996 | MusicTalk: A Microservice Approach for Musical Instrument Recognition Abstract: Musical instrument recognition is the process of using machine learning or audio signal processing to identify and classify different musical instruments from an audio recording. … |
Yi-Bing Lin; Chang-Chieh Cheng; Shih-Chuan Chiu; | IEEE Open Journal of the Computer Society | 2024-01-01 |
| 997 | Self-Supervised Learning of Multi-Level Audio Representations for Music Segmentation Abstract: The task of music structure analysis refers to automatically identifying the location and the nature of musical sections within a song. In the supervised scenario, structural … |
Morgan Buisson; Brian McFee; S. Essid; H. Crayencour; | IEEE/ACM Transactions on Audio, Speech, and Language … | 2024-01-01 |
| 998 | EXPLORE — Explainable Song Recommendation Highlight: We present a novel approach combining advanced algorithms and an interactive user interface. |
Abhinav Arun; Mehul Soni; Palash Choudhary; Saksham Arora; | arxiv-cs.IR | 2023-12-30 |
| 999 | Deep Neural Network Architectures for Audio Emotion Recognition Performed on Song and Speech Modalities |
Souha Ayadi; Z. Lachiri; | Int. J. Speech Technol. | 2023-12-28 |
| 1000 | EnchantDance: Unveiling The Potential of Music-Driven Dance Movement Highlight: In this work, we introduce the EnchantDance framework, a state-of-the-art method for dance generation. |
BO HAN et al.; | arxiv-cs.SD | 2023-12-26 |