Paper Digest: Recent Papers on Question Answering
Paper Digest Team extracted all recent Question Answering related papers on our radar, and generated highlight sentences for them. The results are then sorted by relevance & date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has more coverage and is updated in real time to include the most recent updates on this topic.
As a pioneer in the field since 2018, Paper Digest has curated thousands of such lists, drawing on years of accumulated data across decades of conferences and research topics. To ensure you never miss a breakthrough, our daily service sifts through tens of thousands of new papers, clinical trials, news articles, and community posts, delivering only what matters most to your specific interests. Beyond discovery, Paper Digest offers built-in research tools to help users read articles, write articles, get answers, conduct literature reviews, and generate research reports more efficiently.
Paper Digest Team
New York City, New York, 10017
TABLE 1: Paper Digest: Recent Papers on Question Answering
| | Paper | Author(s) | Source | Date |
|---|---|---|---|---|
| 1 | Helpful or Harmful? Re-Evaluating Frugality in Retrieval-Augmented Generation for Medical Question Answering<br>Highlight: This work introduces a frugality-based evaluation framework that jointly assesses accuracy improvements and computational cost to determine when retrieval-augmented generation is beneficial in medical question answering, rather than evaluating retrieval effectiveness through accuracy alone. | Richard Coric; Ebenezer F. Oloyede; Heriberto Cuayáhuitl | Machine Learning and Knowledge Extraction | 2026-03-06 |
| 2 | RAMoEA-QA: Hierarchical Specialization for Robust Respiratory Audio Question Answering<br>Highlight: They are also only validated in limited settings, leaving it unclear how reliably they handle the shifts encountered in real-world settings. To address these limitations, we introduce RAMoEA-QA, a hierarchically routed generative model for respiratory audio question answering that unifies multiple question types and supports both discrete and continuous targets within a single multimodal system. | Gaia A. Bertolino; Yuwei Zhang; Tong Xia; Domenico Talia; Cecilia Mascolo | arxiv-cs.SD | 2026-03-06 |
| 3 | NCTB-QA: A Large-Scale Bangla Educational Question Answering Dataset and Benchmarking Performance<br>Highlight: These systems tend to produce unreliable responses when correct answers are absent from context. To solve this problem, we introduce NCTB-QA, a large-scale Bangla question answering dataset comprising 87,805 question-answer pairs extracted from 50 textbooks published by Bangladesh’s National Curriculum and Textbook Board. | Abrar Eyasir; Tahsin Ahmed; Muhammad Ibrahim | arxiv-cs.CL | 2026-03-05 |
| 4 | Confidence Before Answering: A Paradigm Shift for Efficient LLM Uncertainty Estimation<br>Highlight: We propose CoCA (Co-optimized Confidence and Answers), a GRPO reinforcement learning framework that jointly optimizes confidence calibration and answer accuracy via segmented credit assignment. | CHANGCHENG LI et al. | arxiv-cs.CL | 2026-03-05 |
| 5 | Who Judges The Judge? Evaluating LLM-as-a-Judge for French Medical Open-ended QA<br>Highlight: We evaluate whether large language models (LLMs) can act as judges of semantic equivalence in French medical OEQA, comparing closed-access, general-purpose, and biomedical domain-adapted models. | Ikram Belmadani; Oumaima El Khettari; Pacôme Constant dit Beaufils; Richard Dufour; Benoit Favre | arxiv-cs.CL | 2026-03-04 |
| 6 | FocusGraph: Graph-Structured Frame Selection for Embodied Long Video Question Answering<br>Highlight: In this work, we develop FocusGraph, a framework for keyframe selection for question answering over long egocentric videos. | TATIANA ZEMSKOVA et al. | arxiv-cs.CV | 2026-03-04 |
| 7 | RAG-X: Systematic Diagnosis of Retrieval-Augmented Generation for Medical Question Answering<br>Highlight: These approaches fail to diagnose whether an error stems from faulty retrieval or flawed generation, limiting developers from performing targeted improvement. To address this gap, we propose RAG-X, a diagnostic framework that evaluates the retriever and generator independently across a triad of QA tasks: information extraction, short-answer generation, and multiple-choice question (MCQ) answering. | Aswini Sivakumar; Vijayan Sugumaran; Yao Qiang | arxiv-cs.CL | 2026-03-03 |
| 8 | KGLMQA: Enhancing Medical Visual Question Answering with Knowledge Graphs and LLMs<br>Highlight: However, existing models frequently encounter challenges such as restricted multimodal interaction, insufficient guidance from external medical knowledge, and a lack of rigorous diagnostic logic in their responses. To address these issues, we propose KGLMQA, a novel framework that integrates knowledge graphs with Large Language Models (LLMs). | Wenhu Wang; Huina Liu; Changfa Wei | PeerJ Computer Science | 2026-03-03 |
| 9 | ORCA: Orchestrated Reasoning with Collaborative Agents for Document Visual Question Answering<br>Highlight: Current approaches struggle to decompose intricate questions into manageable sub-tasks and often fail to leverage specialized processing paths for different document elements. We present ORCA: Orchestrated Reasoning with Collaborative Agents for Document Visual Question Answering, a novel multi-agent framework that addresses these limitations through strategic agent coordination and iterative refinement. | Aymen Lassoued; Mohamed Ali Souibgui; Yousri Kessentini | arxiv-cs.CV | 2026-03-02 |
| 10 | A Self-Reflection Mechanism for Reducing Hallucination in Vietnamese Legal Question Answering Systems<br>Highlight: In this paper, we propose a self-reflection mechanism that adds an iterative generate–evaluate–refine loop to a Graph-RAG pipeline for Vietnamese labor-law questions. | THI VUONG PHAM et al. | Scientific Journal of Computer Science | 2026-03-02 |
| 11 | Let The Agent Search: Autonomous Exploration Beats Rigid Workflows in Temporal Question Answering<br>Highlight: We show that simply granting an off-the-shelf LLM autonomy, that is, letting it decide what to do next, already yields substantial gains even in a strict zero-shot setting. Building on this insight, we propose AT2QA, an autonomous, training-free agent for temporal question answering that iteratively interacts with the temporal knowledge graph via a general search tool for dynamic retrieval. | Xufei Lv; Jiahui Yang; Yifu Gao; Linbo Qiao; Houde Liu | arxiv-cs.CL | 2026-03-02 |
| 12 | Co-MedGraphRAG: A Collaborative Large–Small Model Medical Question-Answering Framework Enhanced By Knowledge Graph Reasoning<br>Highlight: This work serves as a reference for researchers and developers designing medical question-answering frameworks and exploring decision-support applications. | Sizhe Chen; Tao Chen | Information | 2026-03-02 |
| 13 | DeepResearch-9K: A Challenging Benchmark Dataset of Deep-Research Agent<br>Abstract: Deep-research agents are capable of executing multi-step web exploration, targeted retrieval, and sophisticated question answering. Despite their powerful capabilities, … | TONGZHOU WU et al. | arxiv-cs.AI | 2026-03-01 |
| 14 | Act Like A Pathologist: Tissue-Aware Whole Slide Image Reasoning<br>Highlight: In this work, we try to bring models closer to how humans actually examine slides. | WENTAO HUANG et al. | arxiv-cs.CV | 2026-02-28 |
| 15 | LFQA-HP-1M: A Large-Scale Human Preference Dataset for Long-Form Question Answering<br>Highlight: We present LFQA-HP-1M, a large-scale dataset comprising 1.3M human pairwise preference annotations for LFQA. | Rafid Ishrak Jahan; Fahmid Shahriar Iqbal; Sagnik Ray Choudhury | arxiv-cs.CL | 2026-02-26 |
| 16 | SPARTA: Scalable and Principled Benchmark of Tree-Structured Multi-hop QA Over Text and Tables<br>Highlight: We present SPARTA, an end-to-end construction framework that automatically generates large-scale Table-Text QA benchmarks with lightweight human validation, requiring only one quarter of the annotation time of HybridQA. | Sungho Park; Jueun Kim; Wook-Shin Han | arxiv-cs.CL | 2026-02-26 |
| 17 | FHIRPath-QA: Executable Question Answering Over FHIR Electronic Health Records<br>Highlight: In this work, we introduce FHIRPath-QA, the first open dataset and benchmark for patient-specific QA that includes open-standard FHIRPath queries over real-world clinical data. | Michael Frew; Nishit Bheda; Bryan Tripp | arxiv-cs.CL | 2026-02-26 |
| 18 | PATRA: Pattern-Aware Alignment and Balanced Reasoning for Time Series Question Answering<br>Highlight: However, existing LLM-based approaches exhibit two limitations: they often treat time series merely as text or images, failing to capture the patterns like trends and seasonalities needed to answer specific questions; and when trained on a mix of simple and complex tasks, simpler objectives often dominate the learning process, hindering the development of deep reasoning capabilities. To address these limitations, we propose the Pattern-Aware Alignment and Balanced Reasoning model (PATRA), introducing a pattern-aware mechanism that extracts trend and seasonality patterns from time series to achieve deep alignment. | JUNKAI LU et al. | arxiv-cs.AI | 2026-02-26 |
| 19 | MSJoE: Jointly Evolving MLLM and Sampler for Efficient Long-Form Video Understanding<br>Highlight: In this paper, we present MLLM-Sampler Joint Evolution (MSJoE), a novel framework that jointly evolves the MLLM and a lightweight key-frame sampler for efficient long-form video understanding. | WENHUI TAN et al. | arxiv-cs.CV | 2026-02-26 |
| 20 | A Novel Multi-modal Attentional Collaborative Learning Framework with Semantic Enhancement for Audio–visual Question Answering (IF:3) | JIE YANG et al. | Engineering Applications of Artificial Intelligence | 2026-02-25 |
| 21 | Towards Faithful Industrial RAG: A Reinforced Co-adaptation Framework for Advertising QA<br>Highlight: We propose a reinforced co-adaptation framework that jointly optimizes retrieval and generation through two components: (1) Graph-aware Retrieval (GraphRAG), which models entity-relation structure over a high-citation knowledge subgraph for multi-hop, domain-specific evidence selection; and (2) evidence-constrained reinforcement learning via Group Relative Policy Optimization (GRPO) with multi-dimensional rewards covering faithfulness, style compliance, safety, and URL validity. | WENWEI LI et al. | arxiv-cs.CL | 2026-02-25 |
| 22 | A Dataset for Addressing Patient’s Information Needs Related to Clinical Course of Hospitalization<br>Highlight: However, robust datasets to assess the factuality and relevance of AI-generated responses are lacking and, to our knowledge, none capture patient information needs in the context of their EHRs. To address this gap, we introduce ArchEHR-QA, an expert-annotated dataset of 134 cases from intensive care unit and emergency department settings. | Sarvesh Soni; Dina Demner-Fushman | Scientific Data | 2026-02-25 |
| 23 | LiCQA: A Lightweight Complex Question Answering System<br>Highlight: In this paper, we present LiCQA, an unsupervised question answering model that works primarily on the basis of corpus evidence. | Sourav Saha; Dwaipayan Roy; Mandar Mitra | arxiv-cs.CL | 2026-02-25 |
| 24 | DynamicGTR: Leveraging Graph Topology Representation Preferences to Boost VLM Capabilities on Graph QAs<br>Highlight: This “one-size-fits-all” strategy often neglects model-specific and task-specific preferences, resulting in inaccurate or over-lengthy responses to graph-related queries. To address this, we propose the DynamicGTR framework, which dynamically selects the optimal GTR for each query during inference, thereby enhancing the zero-shot graph QA capabilities of VLMs with a customizable accuracy and brevity trade-off. | YANBIN WEI et al. | arxiv-cs.CV | 2026-02-25 |
| 25 | Exploring Multimodal LMMs for Online Episodic Memory Question Answering on The Edge<br>Highlight: We investigate the feasibility of using Multimodal Large Language Models (MLLMs) for real-time online episodic memory question answering. | Giuseppe Lando; Rosario Forte; Antonino Furnari | arxiv-cs.CV | 2026-02-25 |
| 26 | GATES: Self-Distillation Under Privileged Context with Consensus Gating<br>Highlight: We study self-distillation in settings where supervision is unreliable: there are no ground truth labels, verifiable rewards, or external graders to evaluate answers. | Alex Stein; Furong Huang; Tom Goldstein | arxiv-cs.LG | 2026-02-24 |
| 27 | Retrieval-Augmented Generation for Multi-Hop Question Answering Based on Structured Planning<br>Highlight: As the number of retrieval iterations increases, the generated queries can gradually drift from the correct reasoning path, and irrelevant or noisy information may accumulate, ultimately reducing reasoning accuracy. To address these challenges, we propose a novel retrieval-augmented generation method for multi-hop question answering based on structured planning. | Yujiao Huang; Ling Yang; Xu-Hua Yang; Xinli Xu | ACM Transactions on Knowledge Discovery from Data | 2026-02-23 |
| 28 | To Reason or Not To: Selective Chain-of-Thought in Medical Question Answering<br>Highlight: Methods: We propose Selective Chain-of-Thought (Selective CoT), an inference-time strategy that first predicts whether a question requires reasoning and generates a rationale only when needed. | ZAIFU ZHAN et al. | arxiv-cs.CL | 2026-02-23 |
| 29 | Temporal-Aware Heterogeneous Graph Reasoning with Multi-View Fusion for Temporal Question Answering<br>Highlight: We propose a novel framework with temporal-aware question encoding, multi-hop graph reasoning, and multi-view heterogeneous information fusion. | WUZHENGHONG WEN et al. | arxiv-cs.CL | 2026-02-23 |
| 30 | Efficient Multimodal Learning Using BERT and Vision Transformers for Visual Question Answering on Peripheral Blood Cells | Faheem Shehzad; Ciro Mennella; Massimo Esposito; Aniello Minutolo | Discover Artificial Intelligence | 2026-02-22 |
| 31 | Learning to Reason for Multi-Step Retrieval of Personal Context in Personalized Question Answering<br>Highlight: We propose PR2 (Personalized Retrieval-Augmented Reasoning), a reinforcement learning framework that integrates reasoning and retrieval from personal context for personalization. | Maryam Amirizaniani; Alireza Salemi; Hamed Zamani | arxiv-cs.CL | 2026-02-22 |
| 32 | A Large-scale Benchmark for Evaluating Large Language Models on Medical Question Answering in Romanian | ANA-CRISTINA ROGOZ et al. | npj Digital Medicine | 2026-02-21 |
| 33 | Condition-Gated Reasoning for Context-Dependent Biomedical Question Answering<br>Highlight: Existing benchmarks do not evaluate such conditional reasoning, and retrieval-augmented or graph-based methods lack explicit mechanisms to ensure that retrieved knowledge is applicable to given context. To address this gap, we propose CondMedQA, the first benchmark for conditional biomedical QA, consisting of multi-hop questions whose answers vary with patient conditions. | JASH RAJESH PAREKH et al. | arxiv-cs.CL | 2026-02-19 |
| 34 | Decomposing Retrieval Failures in RAG for Long-Document Financial Question Answering<br>Highlight: Across methods, gains in document discovery tend to translate into stronger page recall, yet oracle performance still suggests headroom for page and chunk level retrieval. To target this gap, we introduce a domain fine-tuned page scorer that treats pages as an intermediate retrieval unit between documents and chunks. | Amine Kobeissi; Philippe Langlais | arxiv-cs.CL | 2026-02-19 |
| 35 | InsQABench: Benchmarking Chinese Insurance Domain Question Answering with Large Language Models | LAI WEI et al. | Information Processing & Management | 2026-02-19 |
| 36 | Robustness and Reasoning Fidelity of Large Language Models in Long-Context Code Question Answering<br>Highlight: We conduct a systematic study of long-context code question answering using controlled ablations that test sensitivity to answer format, distractors, and context scale. | Kishan Maharaj; Nandakishore Menon; Ashita Saxena; Srikanth Tamilselvam | arxiv-cs.SE | 2026-02-19 |
| 37 | Evaluating Monolingual and Multilingual Large Language Models for Greek Question Answering: The DemosQA Benchmark<br>Highlight: In this study, we address this research gap in Greek QA by contributing: (i) DemosQA, a novel dataset, which is constructed using social media user questions and community-reviewed answers to better capture the Greek social and cultural zeitgeist; (ii) a memory-efficient LLM evaluation framework adaptable to diverse QA datasets and languages; and (iii) an extensive evaluation of 11 monolingual and multilingual LLMs on 6 human-curated Greek QA datasets using 3 different prompting strategies. | Charalampos Mastrokostas; Nikolaos Giarelis; Nikos Karacapilidis | arxiv-cs.CL | 2026-02-18 |
| 38 | BanglaSummEval: Reference-Free Factual Consistency Evaluation for Bangla Summarization<br>Highlight: We introduce BanglaSummEval, a reference-free, question-answering-based framework for evaluating factual consistency in Bangla summarization. | Ahmed Rafid; Rumman Adib; Fariya Ahmed; Ajwad Abrar; Mohammed Saidul Islam | arxiv-cs.CL | 2026-02-18 |
| 39 | Uncertainty As Feature Gaps: Epistemic Uncertainty Quantification of LLMs in Contextual Question-Answering<br>Highlight: In this work, we focus on UQ for the contextual QA task and propose a theoretically grounded approach to quantify _epistemic uncertainty_. | YAVUZ FARUK BAKMAN et al. | iclr | 2026-02-17 |
| 40 | Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs<br>Highlight: We propose a two-pillar framework, LiteCoST, to achieve both high accuracy and low latency with small language models (SLMs). | ZHUOWEN LIANG et al. | iclr | 2026-02-17 |
| 41 | A$^2$Search: Ambiguity-Aware Question Answering with Reinforcement Learning<br>Highlight: In this paper, we present A$^2$Search, an annotation-free, end-to-end training framework to recognize and handle ambiguity. | FENGJI ZHANG et al. | iclr | 2026-02-17 |
| 42 | Improving MLLMs in Embodied Exploration and Question Answering with Human-Inspired Memory Modeling<br>Highlight: In this work, we propose a non-parametric memory framework that explicitly disentangles episodic and semantic memory for embodied exploration and question answering. | Ji Li; Jing Xia; Mingyi Li; Shiyan Hu | arxiv-cs.RO | 2026-02-17 |
| 43 | Same Content, Different Representations: A Controlled Study for Table QA<br>Highlight: To support detailed analysis, we introduce a diagnostic benchmark with splits along table size, join requirements, query complexity, and schema quality. | Yue Zhang; Seiji Maekawa; Nikita Bhutani | iclr | 2026-02-17 |
| 44 | A Structured, Tagged, and Localized Visual Question Answering Dataset with Full Sentence Answers and Scene Graphs for Chest X-ray Images<br>Highlight: To address these limitations, we introduce MIMIC-Ext-CXR-QBA (abbr. …). We automatically generated our VQA dataset from scene graphs (also made available), which we constructed using LLM-based information extraction from radiology reports. | Philip Müller; Friederike Jungmann; Georgios Kaissis; Daniel Rueckert | iclr | 2026-02-17 |
| 45 | AssoMem: Scalable Memory QA with Multi-Signal Associative Retrieval<br>Highlight: Inspired by how humans link information associatively, we propose AssoMem, a novel framework constructing an associative memory graph that anchors dialogue utterances to automatically extracted clues. | KAI ZHANG et al. | iclr | 2026-02-17 |
| 46 | Spatial Reasoning with Vision-Language Models in Ego-Centric Multi-View Scenes<br>Highlight: Our results reveal a notable performance gap between human level scores and VLM performance, highlighting that current VLMs still fall short of human level spatial understanding (SU). To bridge this gap, we propose Ego3D-VLM, a post-training framework that enhances 3D spatial reasoning of VLMs. | MOHSEN GHOLAMI et al. | iclr | 2026-02-17 |
| 47 | MoL: Adaptive Mixture-of-Length Reasoning for Efficient Question Answering with Context<br>Highlight: We present Mixture-of-Length (MoL), an approach for Question Answering (QA) with context that aims to improve the balance between reasoning quality and response efficiency. | GUOCONG LI et al. | iclr | 2026-02-17 |
| 48 | VidBridge-R1: Bridging QA and Captioning for RL-based Video Understanding Models with Intermediate Proxy Tasks<br>Highlight: Naively combining reward signals from these tasks results in mutual performance degradation, which we attribute to a conflict between their opposing task natures. To address this challenge, we propose a novel training framework built upon two intermediate proxy tasks: DarkEventInfer, which presents videos with masked event segments, requiring models to infer the obscured content based on contextual video cues; and MixVidQA, which presents interleaved video sequences composed of two distinct clips, challenging models to isolate and reason about one while disregarding the other. | XINLONG CHEN et al. | iclr | 2026-02-17 |
| 49 | Choices Speak Louder Than Questions<br>Highlight: We introduce a new scoring method called **Normalized Probability Shift by the Question (NPSQ)**, designed to isolate the impact of the question itself and provide a more reliable assessment of comprehension. | Gyeongje Cho; Yeonkyoung So; Jaejin Lee | iclr | 2026-02-17 |
| 50 | FrugalRAG: Less Is More in RL Finetuning for Multi-hop Question Answering<br>Highlight: We propose FrugalRAG, a two-stage finetuning framework that adaptively _reduces_ the number of retrieval steps based on a question’s difficulty. | Abhinav Java; Srivathsan Koundinyan; Nagarajan Natarajan; Amit Sharma | iclr | 2026-02-17 |
| 51 | Are LLMs Really Not Knowledgeable? Mining The Submerged Knowledge in LLMs’ Memory<br>Highlight: By analyzing the token-level output distributions, we find that correct answers often appear among high-probability candidates, despite not being selected. Motivated by this, we propose Hits@k, a novel metric to evaluate latent knowledge retention independent of answer surface form. | Xingjian Tao; Yiwei Wang; Yujun Cai; Zhicheng Yang; Jing Tang | iclr | 2026-02-17 |
| 52 | QuRL: Rubrics As Judge For Open-Ended Question Answering<br>Highlight: To address these limitations, we introduce a schema for generating case-wise rubrics that are question-specific, content-based and stylistically sensitive, thereby evaluating both factual soundness and writing quality. Building on this schema, we propose QuRL (Open-Ended QA with Rubric-guided Reinforcement Learning), a framework that automatically mines rubrics for each question from easily accessible online sources and leverages them as reward signals. | Xiyu Wei; Qingwei Zong; Xiaoguang Li; Eugene J. Yu; Sujian Li | iclr | 2026-02-17 |
| 53 | FAST-EQA: Efficient Embodied Question Answering with Global and Local Region Relevancy<br>Highlight: We introduce FAST-EQA, a question-conditioned framework that (i) identifies likely visual targets, (ii) scores global regions of interest to guide navigation, and (iii) employs Chain-of-Thought (CoT) reasoning over visual memory to answer confidently. | Haochen Zhang; Nirav Savaliya; Faizan Siddiqui; Enna Sachdeva | arxiv-cs.RO | 2026-02-17 |
| 54 | CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmarking of Large Language Models in Mental Health Question Answering<br>Highlight: We present CounselBench, a large-scale benchmark developed with 100 mental health professionals to evaluate and stress-test large language models (LLMs) in realistic help-seeking scenarios. | YAHAN LI et al. | iclr | 2026-02-17 |
| 55 | M4PQA: A Comprehensive QA Dataset for AI Research with Instance-Level Evaluation<br>Highlight: In this work, we propose M4PQA, a human-annotated comprehensive paper QA dataset in the field of artificial intelligence, with 13,948 papers and 1,246 questions, that encompasses multi-task, multi-modal and instance-level evaluation. | TIANCHENG HUANG et al. | iclr | 2026-02-17 |
| 56 | Query-Guided Spatial–Temporal–Frequency Interaction for Music Audio–Visual Question Answering<br>Highlight: However, in those methods, the audio input is primarily treated as complementary to video analysis, and the textual question information contributes minimally to audio–visual understanding, as it is typically integrated only in the final stages of reasoning. To address these limitations, we propose a novel Query-guided Spatial–Temporal–Frequency (QSTar) interaction method, which effectively incorporates question-guided clues and exploits the distinctive frequency-domain characteristics of audio signals, alongside spatial and temporal perception, to enhance audio–visual understanding. | Kun Li; Michael Ying Yang; Sami Sebastian Brandt | iclr | 2026-02-17 |
| 57 | Harnessing Temporal Databases for Systematic Evaluation of Factual Time-Sensitive Question-Answering in LLMs<br>Highlight: Although factual Time-Sensitive Question-Answering (TSQA) tasks have been widely developed, existing benchmarks often face manual bottlenecks that limit scalable and comprehensive TSQA evaluation. To address this issue, we propose TDBench, a new benchmark that systematically constructs TSQA pairs by harnessing temporal databases and database techniques, such as temporal functional dependencies, temporal SQL, and temporal joins. | Soyeon Kim; Jindong Wang; Xing Xie; Steven Euijong Whang | iclr | 2026-02-17 |
| 58 | AQuA: Toward Strategic Response Generation for Ambiguous Visual Questions<br>Highlight: In this paper, we introduce Ambiguous Visual Question Answering (AQuA), a fine-grained dataset that classifies ambiguous VQA instances into four levels according to the nature and degree of ambiguity, along with the optimal response strategy for each case. | Jihyoung Jang; Hyounghun Kim | iclr | 2026-02-17 |
| 59 | IKIA: Image-Knowledge Internalization Assistance Model for Medical Visual Question Answering | Yurun Bi; Xingang Wang; Yuteng Xiao; Yudong Zhang | International Journal of Machine Learning and Cybernetics | 2026-02-16 |
| 60 | KenLumachiQuAD – A Question Answering Dataset for Kenyan Luhya Lumarachi Language for Machine Learning<br>Highlight: Question-answering (QA) datasets play a crucial role in testing and training machine learning models, from which we can develop practical end-user applications, such as internet search, dialogue systems, and chatbots. | Barack Wamkaya Wanjawa; Lawrence Muchemi; Evans Miriti | East African Journal of Information Technology | 2026-02-16 |
| 61 | Automating Construction Contract Question Answering Using Large Language Model and Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
MINGYU ZHANG et al. | Expert Syst. Appl. | |
| 62 | Index Light, Reason Deep: Deferred Visual Ingestion for Visual-Dense Document Question Answering Highlight: This paper proposes the Deferred Visual Ingestion (DVI) framework, adopting a demand-side ingestion strategy: the indexing phase performs only lightweight metadata extraction, deferring visual understanding to the moment users pose specific questions. |
Tao Xu; | arxiv-cs.CL | 2026-02-15 |
| 63 | SRA: Semantic Relation-Aware Flowchart Question Answering Highlight: To address the issue, we propose a novel Semantic Relation-Aware (SRA) FlowchartQA approach. |
Xinyu Li; Bowei Zou; Yuchong Chen; Yifan Fan; Yu Hong; | arxiv-cs.MM | 2026-02-14 |
| 64 | TraceBack: Multi-Agent Decomposition for Fine-Grained Table Attribution Highlight: Existing table QA systems rarely provide fine-grained attribution, so even correct answers often lack verifiable grounding, limiting trust in high-stakes settings. We address this with TraceBack, a modular multi-agent framework for scalable, cell-level attribution in single-table QA. |
TEJAS ANVEKAR et al. | arxiv-cs.CL | 2026-02-13 |
| 65 | RoadscapesQA: A Multitask, Multimodal Dataset for Visual Question Answering on Indian Roads Highlight: In this paper, we describe the data collection and annotation process, present key dataset statistics, and provide initial baselines for image QA tasks using vision-language models. |
Vijayasri Iyer; Maahin Rathinagiriswaran; Jyothikamalesh S; | arxiv-cs.CV | 2026-02-13 |
| 66 | Who Is The Richest Club in The Championship? Detecting and Rewriting Underspecified Questions Improve QA Performance Highlight: We argue that this gap is partly due to underspecified questions – queries whose interpretation cannot be uniquely determined without additional context. To test this hypothesis, we introduce an LLM-based classifier to identify underspecified questions and apply it to several widely used QA datasets, finding that 16% to over 50% of benchmark questions are underspecified and that LLMs perform significantly worse on them. |
Yunchong Huang; Gianni Barlacchi; Sandro Pezzelle; | arxiv-cs.CL | 2026-02-12 |
| 67 | RSHallu: Dual-Mode Hallucination Evaluation for Remote-Sensing Multimodal Large Language Models with Domain-Tailored Mitigation Highlight: Multimodal large language models (MLLMs) are increasingly adopted in remote sensing (RS) and have shown strong performance on tasks such as RS visual grounding (RSVG), RS visual question answering (RSVQA), and multimodal dialogue. |
ZIHUI ZHOU et al. | arxiv-cs.CV | 2026-02-11 |
| 68 | MultiCube-RAG for Multi-hop Question Answering Highlight: Built on the cube structure, we propose MultiCube-RAG, a training-free method consisting of multiple cubes for multi-step reasoning and retrieval. |
JIMENG SHI et al. | arxiv-cs.CL | 2026-02-11 |
| 69 | The CLEF-2026 FinMMEval Lab: Multilingual and Multimodal Evaluation of Financial AI Systems Highlight: We present the setup and the tasks of the FinMMEval Lab at CLEF 2026, which introduces the first multilingual and multimodal evaluation framework for financial Large Language Models (LLMs). |
ZHUOHAN XIE et al. | arxiv-cs.CL | 2026-02-11 |
| 70 | PulseLM: A Foundation Dataset and Benchmark for PPG-Text Learning Highlight: In this work, we introduce PulseLM, a large-scale PPG-text dataset designed to bridge raw PPG waveforms and natural language through a unified, closed-ended question answering (QA) formulation. |
HUNG MANH PHAM et al. | arxiv-cs.CL | 2026-02-10 |
| 71 | Comprehensive Comparison of RAG Methods Across Multi-Domain Conversational QA Highlight: We present a comprehensive empirical study of vanilla and advanced RAG methods across eight diverse conversational QA datasets spanning multiple domains. |
Klejda Alushi; Jan Strich; Chris Biemann; Martin Semmann; | arxiv-cs.CL | 2026-02-10 |
| 72 | AnalyticsGPT: An LLM Workflow for Scientometric Question Answering Highlight: This paper introduces AnalyticsGPT, an intuitive and efficient large language model (LLM)-powered workflow for scientometric question answering. |
Khang Ly; Georgios Cheirmpos; Adrian Raudaschl; Christopher James; Seyed Amin Tabatabaei; | arxiv-cs.CL | 2026-02-10 |
| 73 | Engineering Trustworthy Retrieval-Augmented Generation for EU Electricity Market Regulation Highlight: This paper presents the design of a domain-specific Retrieval-Augmented Generation (RAG) system for EU electricity market regulations, explicitly engineered to deliver source-grounded, traceable and low-hallucination answers. |
Șener Ali; Simona-Vasilica Oprea; Adela Bâra; | Electronics | 2026-02-10 |
| 74 | CAPID: Context-Aware PII Detection for Question-Answering Systems Highlight: To achieve privacy-preserving PII detection, we propose CAPID, a practical approach that fine-tunes a locally owned small language model (SLM) that filters sensitive information before it is passed to LLMs for QA. |
MARIIA PONOMARENKO et al. | arxiv-cs.CR | 2026-02-10 |
| 75 | Vista: Scene-Aware Optimization for Streaming Video Question Answering Under Post-Hoc Queries Highlight: We present Vista, a novel framework for scene-aware streaming video QA that enables efficient and scalable reasoning over continuous video streams. |
HAOCHENG LU et al. | arxiv-cs.CV | 2026-02-09 |
| 76 | The Development and Evaluation of Agricultural Question-answering Systems Based on Large Language Models |
Ayşe Eldem; Hüseyin Eldem; | Scientific Reports | 2026-02-09 |
| 77 | CoRect: Context-Aware Logit Contrast for Hidden State Rectification to Resolve Knowledge Conflicts Highlight: Through layer-wise analysis, we attribute this failure to a parametric suppression phenomenon: specifically, in deep layers, certain FFN layers overwrite context-sensitive representations with memorized priors. To address this, we propose CoRect (Context-Aware Logit Contrast for Hidden State Rectification). |
Xuhua Ma; Richong Zhang; Zhijie Nie; | arxiv-cs.CL | 2026-02-08 |
| 78 | Long-Context Long-Form Question Answering for Legal Domain Highlight: In this paper, we address the challenges of long-context question answering in context of long-form answers given the idiosyncrasies of legal documents. |
ANAGHA KULKARNI et al. | arxiv-cs.CL | 2026-02-06 |
| 79 | ViHERMES: A Graph-Grounded Multihop Question Answering Benchmark and System for Vietnamese Healthcare Regulations Highlight: In this work, we introduce the Vietnamese Healthcare Regulations-Multihop Reasoning Dataset (ViHERMES), a benchmark designed for multihop QA over Vietnamese healthcare regulatory documents. |
LONG S. T. NGUYEN et al. | arxiv-cs.CL | 2026-02-06 |
| 80 | CompactRAG: Reducing LLM Calls and Token Overhead in Multi-Hop Question Answering Highlight: We propose CompactRAG, a simple yet effective framework that decouples offline corpus restructuring from online reasoning. |
HAO YANG et al. | arxiv-cs.CL | 2026-02-05 |
| 81 | MediQAl: A French Medical Question Answering Dataset for Knowledge and Reasoning Evaluation Highlight: This work introduces MediQAl, a French medical question answering dataset designed to evaluate the capabilities of language models in factual medical recall and reasoning over real-world clinical scenarios. |
Adrien Bazoge; | Scientific Data | 2026-02-05 |
| 82 | Cross-Lingual Empirical Evaluation of Large Language Models for Arabic Medical Tasks Highlight: In this study, we conduct a cross-lingual empirical analysis of LLM performance on Arabic and English medical question and answering. |
Chaimae Abouzahir; Congbo Ma; Nizar Habash; Farah E. Shamout; | arxiv-cs.CL | 2026-02-05 |
| 83 | IRPAPERS: A Visual Document Benchmark for Scientific Retrieval and Question Answering Highlight: We introduce IRPAPERS, a benchmark of 3,230 pages from 166 scientific papers, with both an image and an OCR transcription for each page. |
CONNOR SHORTEN et al. | arxiv-cs.IR | 2026-02-05 |
| 84 | RA-QA: Towards Respiratory Audio-based Health Question Answering Highlight: This new data resource contains about 7.5 million QA pairs spanning more than 60 attributes and three question types: single verification, multiple choice, and open-ended questions. Building upon this dataset, we introduce a novel benchmark that compares audio-text generation models with traditional audio classifiers to evaluate their respective performance. |
Gaia A. Bertolino; Yuwei Zhang; Tong Xia; Domenico Talia; Cecilia Mascolo; | arxiv-cs.SD | 2026-02-04 |
| 85 | Knowing When to Answer: Adaptive Confidence Refinement for Reliable Audio-Visual Question Answering Highlight: We present a formal problem formulation for Reliable Audio-Visual Question Answering (R-AVQA), where we prefer abstention over answering incorrectly. |
DINH PHU TRAN et al. | arxiv-cs.LG | 2026-02-04 |
| 86 | JSynFlow: Japanese Synthesised Flowchart Visual Question Answering Dataset Built with Large Language Models Highlight: However, developing VLMs with precise flowchart understanding requires large-scale datasets of flowchart images and corresponding text, the creation of which is highly time-consuming. To address this challenge, we introduce JSynFlow, a synthesised visual QA dataset for Japanese flowcharts, generated using large language models (LLMs). |
Hiroshi Sasaki; | arxiv-cs.CV | 2026-02-03 |
| 87 | ST-Raptor: An Agentic System for Semi-Structured Table QA Highlight: Existing Text-to-SQL methods typically require converting semi-structured tables into structured formats, inevitably leading to information loss, while approaches like Text-to-Code and multimodal LLM-based QA struggle with complex layouts and often yield inaccurate answers. To address these limitations, we present ST-Raptor, an agentic system for semi-structured table QA. |
JINXIU QU et al. | arxiv-cs.AI | 2026-02-03 |
| 88 | No Shortcuts to Culture: Indonesian Multi-hop Question Answering for Complex Cultural Understanding Highlight: We present a new framework that systematically transforms single-hop cultural questions into multi-hop reasoning chains spanning six clue types (e.g., commonsense, temporal, geographical). |
Vynska Amalia Permadi; Xingwei Tan; Nafise Sadat Moosavi; Nikos Aletras; | arxiv-cs.CL | 2026-02-03 |
| 89 | OmniRAG-Agent: Agentic Omnimodal Reasoning for Low-Resource Long Audio-Video Question Answering Highlight: Long-horizon omnimodal question answering answers questions by reasoning over text, images, audio, and video. Despite recent progress on OmniLLMs, low-resource long audio-video QA still suffers from costly dense encoding, weak fine-grained retrieval, limited proactive planning, and no clear end-to-end optimization. To address these issues, we propose OmniRAG-Agent, an agentic omnimodal QA method for budgeted long audio-video reasoning. |
YIFAN ZHU et al. | arxiv-cs.CL | 2026-02-03 |
| 90 | CRAFT: Calibrated Reasoning with Answer-Faithful Traces Via Reinforcement Learning for Multi-Hop Question Answering Highlight: Traditional chain-of-thought generation often deviates from required structured output formats, leading to incomplete or malformed structured content. To address these challenges, we propose CRAFT (Calibrated Reasoning with Answer-Faithful Traces), a Group Relative Policy Optimization (GRPO) based reinforcement learning framework that trains models to perform faithful reasoning during response generation. |
YU LIU et al. | arxiv-cs.CL | 2026-02-01 |
| 91 | Understanding QA Generation: Extracting Parametric and Contextual Knowledge with CQA for Low Resource Bangla Language Highlight: Our work not only introduces a novel framework for analyzing knowledge sources in Bangla QA but also uncovers critical findings that open up broader directions for counterfactual reasoning in low-resource language settings. |
Umme Abira Azmary; MD Ikramul Kayes; Swakkhar Shatabda; Farig Yousuf Sadeque; | arxiv-cs.CL | 2026-02-01 |
| 92 | EEmo-Logic: A Unified Dataset and Multi-Stage Framework for Comprehensive Image-Evoked Emotion Assessment Highlight: However, existing models are still limited to coarse-grained emotion perception or deficient reasoning capabilities. To bridge this gap, we introduce EEmoDB, the largest image-evoked emotion understanding dataset to date. |
LANCHENG GAO et al. | arxiv-cs.CV | 2026-02-01 |
| 93 | PARSE: An Open-Domain Reasoning Question Answering Benchmark for Persian Highlight: We introduce PARSE, the first open-domain Persian reasoning QA benchmark, containing 10,800 questions across Boolean, multiple-choice, and factoid formats, with diverse reasoning types, difficulty levels, and answer structures. |
Jamshid Mozafari; Seyed Parsa Mousavinasab; Adam Jatowt; | arxiv-cs.CL | 2026-02-01 |
| 94 | Inferential Question Answering Highlight: We introduce Inferential QA, a new task that challenges models to infer answers from answer-supporting passages which provide only clues. |
Jamshid Mozafari; Hamed Zamani; Guido Zuccon; Adam Jatowt; | arxiv-cs.CL | 2026-02-01 |
| 95 | DeALOG: Decentralized Multi-Agents Log-Mediated Reasoning Framework Highlight: We introduce DeALOG, a decentralized multi-agent framework for multimodal question answering. |
Abhijit Chakraborty; Ashish Raj Shekhar; Shiven Agarwal; Vivek Gupta; | arxiv-cs.CL | 2026-01-31 |
| 96 | Reasoning By Commented Code for Table Question Answering Highlight: This work introduces a commented, step-by-step code-generation framework that incorporates explicit reasoning into the Python program-generation process. |
Seho Pyo; Jiheon Seok; Jaejin Lee; | arxiv-cs.CL | 2026-01-31 |
| 97 | MedSpeak: A Knowledge Graph-Aided ASR Error Correction Framework for Spoken Medical QA Highlight: To this end, we propose MedSpeak, a novel knowledge graph-aided ASR error correction framework that refines noisy transcripts and improves downstream answer prediction by leveraging both semantic relationships and phonetic information encoded in a medical knowledge graph, together with the reasoning power of LLMs. |
YUTONG SONG et al. | arxiv-cs.CL | 2026-01-31 |
| 98 | Can Small Language Models Handle Context-Summarized Multi-Turn Customer-Service QA? A Synthetic Data-Driven Comparative Evaluation Highlight: Small Language Models (SLMs) provide a more efficient alternative, yet their effectiveness for multi-turn customer-service QA remains underexplored, particularly in scenarios requiring dialogue continuity and contextual understanding. |
Lakshan Cooray; Deshan Sumanathilaka; Pattigadapa Venkatesh Raju; | arxiv-cs.CL | 2026-01-31 |
| 99 | Benchmarking Uncertainty Calibration in Large Language Model Long-Form Question Answering Highlight: We introduce the first large-scale benchmark for evaluating UQ metrics in reasoning-demanding QA studying calibration of UQ methods, providing an extensible open-source framework to reproducibly assess calibration. |
Philip Müller; Nicholas Popovič; Michael Färber; Peter Steinbach; | arxiv-cs.CL | 2026-01-30 |
| 100 | TSAQA: Time Series Analysis Question And Answering Benchmark Highlight: We introduce TSAQA, a novel unified benchmark designed to broaden task coverage and evaluate diverse temporal analysis capabilities. |
BAOYU JING et al. | arxiv-cs.AI | 2026-01-30 |
| 101 | Gender Disparities in StackOverflow’s Community-Based Question Answering: A Matter of Quantity Versus Quality Highlight: In this study, we investigate whether answer quality is influenced by gender using a combination of human evaluations and automated assessments powered by Large Language Models. |
Maddalena Amendola; Cosimo Rulli; Carlos Castillo; Andrea Passarella; Raffaele Perego; | arxiv-cs.CY | 2026-01-30 |
| 102 | CE-GOCD: Central Entity-Guided Graph Optimization for Community Detection to Augment LLM Scientific Question Answering Highlight: This impairs the LLM’s comprehension of scientific literature, hindering the comprehensiveness and specificity of its responses. To address this, we propose Central Entity-Guided Graph Optimization for Community Detection (CE-GOCD), a method that augments LLMs’ scientific question answering by explicitly modeling and leveraging semantic substructures within academic knowledge graphs. |
JIAYIN LAN et al. | arxiv-cs.CL | 2026-01-29 |
| 103 | Ontology-grounded Knowledge Graphs for Mitigating Hallucinations in Large Language Models for Clinical Question Answering |
Mohamed Ali; Zaki Taha; Mohamed Mabrouk Morsey; | Journal of Biomedical Informatics | 2026-01-28 |
| 104 | Disaster Question Answering with LoRA Efficiency and Accurate End Position Highlight: This work introduces a disaster-focused question answering system based on Japanese disaster situations and response experiences. |
Takato Yasuno; | arxiv-cs.CL | 2026-01-27 |
| 105 | MQADet: A Plug-and-play Paradigm for Enhancing Open-vocabulary Object Detection Via Multimodal Question Answering Highlight: Existing open-vocabulary detectors often suffer from visual-textual misalignment and long-tailed category imbalance, leading to poor performance when handling objects described by complex, long-tailed textual queries. To overcome these challenges, we propose Multimodal Question Answering Detection (MQADet), a universal plug-and-play paradigm that enhances existing open-vocabulary detectors by leveraging the cross-modal reasoning capabilities of multimodal large language models (MLLMs). |
CAIXIONG LI et al. | Scientific Reports | 2026-01-27 |
| 106 | Evaluating Reasoning Large Language Models with Human-like Thinking in Ophthalmic Question Answering Highlight: We evaluated two state-of-the-art open-source reasoning LLMs (DeepSeek-R1 and QwQ-32B) and one conventional non-reasoning LLM (LLaMA-3.3-70B-Instruct) models on ophthalmology questions, assessing not only answer accuracy (ACC) but also the quality of their reasoning processes. |
ZHOUQIAN WANG et al. | BMJ Open Ophthalmology | 2026-01-27 |
| 107 | SRCR: Faithful Structured Reasoning with Curriculum Reinforcement Learning for Explainable Question Answering |
YUE FAN et al. | Information Processing & Management | 2026-01-27 |
| 108 | Query-Guided Spatial-Temporal-Frequency Interaction for Music Audio-Visual Question Answering Highlight: However, in those methods, the audio input is primarily treated as complementary to video analysis, and the textual question information contributes minimally to audio–visual understanding, as it is typically integrated only in the final stages of reasoning. To address these limitations, we propose a novel Query-guided Spatial–Temporal–Frequency (QSTar) interaction method, which effectively incorporates question-guided clues and exploits the distinctive frequency-domain characteristics of audio signals, alongside spatial and temporal perception, to enhance audio–visual understanding. |
Kun Li; Michael Ying Yang; Sami Sebastian Brandt; | arxiv-cs.CV | 2026-01-27 |
| 109 | V-Loop: Visual Logical Loop Verification for Hallucination Detection in Medical Visual Question Answering Highlight: Recent introspective detection methods, particularly uncertainty-based approaches, offer computational efficiency but are fundamentally indirect, as they estimate predictive uncertainty for an image-question pair rather than verifying the factual correctness of a specific answer. To address this limitation, we propose Visual Logical Loop Verification (V-Loop), a training-free and plug-and-play framework for hallucination detection in medical VQA. |
Mengyuan Jin; Zehui Liao; Yong Xia; | arxiv-cs.CV | 2026-01-26 |
| 110 | UniPACT: A Multimodal Framework for Prognostic Question Answering on Raw ECG and Structured EHR Highlight: Large Language Models (LLMs) offer a powerful reasoning engine for this task but struggle to natively process these heterogeneous, non-textual data types. To address this, we propose UniPACT (Unified Prognostic Question Answering for Clinical Time-series), a unified framework for prognostic question answering that bridges this modality gap. |
Jialu Tang; Tong Xia; Yuan Lu; Aaqib Saeed; | arxiv-cs.LG | 2026-01-25 |
| 111 | Mind The Ambiguity: Aleatoric Uncertainty Quantification in LLMs for Safe Medical Question Answering Highlight: The deployment of Large Language Models in Medical Question Answering is severely hampered by ambiguous user queries, a significant safety risk that demonstrably reduces answer accuracy in high-stakes healthcare settings. In this paper, we formalize this challenge by linking input ambiguity to aleatoric uncertainty (AU), which is the irreducible uncertainty arising from underspecified input. |
YAOKUN LIU et al. | arxiv-cs.CL | 2026-01-23 |
| 112 | DeepEra: A Deep Evidence Reranking Agent for Scientific Retrieval-Augmented Generated Question Answering Highlight: Retrieval-Augmented Generation (RAG) enhances LLMs by incorporating knowledge from external sources, thereby providing credible evidence for scientific question answering. But existing retrieval and reranking methods remain vulnerable to passages that are semantically similar but logically irrelevant, often reducing factual reliability and amplifying hallucinations. To address this challenge, we propose a Deep Evidence Reranking Agent (DeepEra) that integrates step-by-step reasoning, enabling more precise evaluation of candidate passages beyond surface-level semantics. |
HAOTIAN CHEN et al. | arxiv-cs.CL | 2026-01-23 |
| 113 | Beyond Factual QA: Mentorship-Oriented Question Answering Over Long-Form Multilingual Content Highlight: We introduce MentorQA, the first multilingual dataset and evaluation framework for mentorship-focused question answering from long-form videos, comprising nearly 9,000 QA pairs from 180 hours of content across four languages. |
Parth Bhalerao; Diola Dsouza; Ruiwen Guan; Oana Ignat; | arxiv-cs.CL | 2026-01-23 |
| 114 | DF-RAG: Query-Aware Diversity for Retrieval-Augmented Generation Highlight: However, RAG is often challenged by reasoning-intensive question-answering (QA), since common retrieval methods like cosine similarity maximize relevance at the cost of introducing redundant content, which can reduce information recall. To address this, we introduce Diversity-Focused Retrieval-Augmented Generation (DF-RAG), which systematically incorporates diversity into the retrieval step to improve performance on complex, reasoning-intensive QA benchmarks. |
SAADAT HASAN KHAN et al. | arxiv-cs.CL | 2026-01-23 |
| 115 | Clarify or Answer: Reinforcement Learning for Agentic VQA with Context Under-specification Highlight: We propose CoA (Clarify-or-Answer), an ask-or-answer agent that separately models the decision to ask or answer, and what to ask if needed. |
Zongwan Cao; Bingbing Wen; Lucy Lu Wang; | arxiv-cs.CL | 2026-01-22 |
| 116 | ManuRAG: Multi-modal Retrieval Augmented Generation for Manufacturing Question Answering Highlight: We introduce ManuRAG, an innovative multi-modal RAG framework designed for manufacturing QA, incorporating specialized techniques to improve answer accuracy, reliability, and interpretability. |
Yunqing Li; Zihan Dong; Farhad Ameri; Jianbang Zhang; | arxiv-cs.CE | 2026-01-21 |
| 117 | LogicScore: Fine-grained Logic Evaluation of Conciseness, Completeness, and Determinateness in Attributed Question Answering Highlight: Consequently, Large Language Models (LLMs) often produce factually grounded yet logically incoherent responses with elusive deductive gaps. To mitigate this limitation, we present LogicScore, a unified evaluation framework that shifts the paradigm from local assessment to global reasoning scrutiny. |
ZHICHAO YAN et al. | arxiv-cs.CL | 2026-01-21 |
| 118 | DisasterVQA: A Visual Question Answering Benchmark Dataset for Disaster Scenes Highlight: We introduce DisasterVQA, a benchmark dataset designed for perception and reasoning in crisis contexts. |
Aisha Al-Mohannadi; Ayisha Firoz; Yin Yang; Muhammad Imran; Ferda Ofli; | arxiv-cs.CV | 2026-01-20 |
| 119 | Domain-Adaptation Through Synthetic Data: Fine-Tuning Large Language Models for German Law Highlight: This paper presents an effective method for adapting advanced LLMs to German legal question answering through a novel synthetic data generation approach. |
ALI HAMZA BASHIR et al. | arxiv-cs.CL | 2026-01-20 |
| 120 | Knowledge-based Question Answering Using Graph Neural Networks and Contextual Language Representations Highlight: This work introduces a novel question answering (QA) framework that integrates commonsense knowledge from ConceptNet with deep contextual embeddings from BERT using a graph neural network for structured reasoning. |
Mohamed Samir; Naglaa Fathy; Walaa Gad; | Scientific Reports | 2026-01-20 |
| 121 | ChartAttack: Testing The Vulnerability of LLMs to Malicious Prompting in Chart Generation Highlight: In this work, we introduce ChartAttack, a novel framework for evaluating how MLLMs can be misused to generate misleading charts at scale. |
Jesus-German Ortiz-Barajas; Jonathan Tonglet; Vivek Gupta; Iryna Gurevych; | arxiv-cs.CL | 2026-01-19 |
| 122 | BioPulse-QA: A Dynamic Biomedical Question-Answering Benchmark for Evaluating Factuality, Robustness, and Bias in Large Language Models Highlight: They also carry increasing risk of data leakage due to overlap with model pretraining corpora and often overlook critical dimensions such as robustness to linguistic variation and potential demographic biases. To address these gaps, we introduce BioPulse-QA, a benchmark that evaluates LLMs on answering questions from newly published biomedical documents including drug labels, trial protocols, and clinical guidelines. |
KRITI BHATTARAI et al. | arxiv-cs.CL | 2026-01-18 |
| 123 | Augmenting Question Answering with A Hybrid RAG Approach Highlight: In this paper, we introduce Structured-Semantic RAG (SSRAG), a hybrid architecture that enhances QA quality by integrating query augmentation, agentic routing, and a structured retrieval mechanism combining vector and graph based techniques with context unification. |
TIANYI YANG et al. | arxiv-cs.CL | 2026-01-18 |
| 124 | AVIR: Adaptive Visual In-Document Retrieval for Efficient Multi-Page Document Question Answering Highlight: Multi-page Document Visual Question Answering (MP-DocVQA) remains challenging because long documents not only strain computational resources but also reduce the effectiveness of the attention mechanism in large vision-language models (LVLMs). We tackle these issues with an Adaptive Visual In-document Retrieval (AVIR) framework. |
Zongmin Li; Yachuan Li; Lei Kang; Dimosthenis Karatzas; Wenkang Ma; | arxiv-cs.CV | 2026-01-17 |
| 125 | SolarGPT-QA: A Domain-Adaptive Large Language Model for Educational Question Answering in Space Weather and Heliophysics Highlight: We introduce SolarGPT-QA, a question answering system based on a domain-adapted large language model built on the LLaMA-3 base model. |
Santosh Chapagain; MohammadReza EskandariNasab; Onur Vural; Shah Muhammad Hamdi; Soukaina Filali Boubrahimi; | arxiv-cs.LG | 2026-01-17 |
| 126 | Reasoning in Trees: Improving Retrieval-Augmented Generation for Multi-Hop Question Answering Highlight: For multi-hop QA tasks, current iterative approaches predominantly rely on LLMs to self-guide and plan multi-step exploration paths during retrieval, leading to substantial challenges in maintaining reasoning coherence across steps from inaccurate query decomposition and error propagation. To address these issues, we introduce Reasoning Tree Guided RAG (RT-RAG), a novel hierarchical framework for complex multi-hop QA. |
YULING SHI et al. | arxiv-cs.CL | 2026-01-16 |
| 127 | A Topic-aware Evaluation of ChatGPT’s Semantic Alignment with Community Answers Using BERTScore and BERTopic Highlight: This study introduces a topic-sensitive evaluation framework that enhances understanding of large language model behavior in real-world QA scenarios and supports the development of more effective and explainable conversational artificial intelligence (AI) systems. |
Mashael M. Alsulami; | PeerJ Computer Science | 2026-01-16 |
| 128 | From Single to Multi-Agent Reasoning: Advancing GeneGPT for Genomics QA Highlight: We replicate GeneGPT and propose GenomAgent, a multi-agent framework that efficiently coordinates specialized agents for complex genomics queries. |
Kimia Abedini; Farzad Shami; Gianmaria Silvello; | arxiv-cs.AI | 2026-01-15 |
| 129 | KG-ViP: Bridging Knowledge Grounding and Visual Perception in Multi-modal LLMs for Visual Question Answering Highlight: However, prior works typically treat them in isolation, overlooking their synergistic potential. To bridge this gap, we propose KG-ViP, a unified framework that empowers MLLMs by fusing scene graphs and commonsense graphs. |
Zhiyang Li; Ao Ke; Yukun Cao; Xike Xie; | arxiv-cs.CV | 2026-01-14 |
| 130 | An Electronic Product Carbon Footprint Dataset for Question Answering Highlight: This work establishes a foundation for training advanced language models to automate aggregation and standardization of emissions data for ICT systems. |
Kaiwen Zhao; Ajesh Koyatan Chathoth; Bharathan Balaji; Stephen Lee; | Scientific Data | 2026-01-14 |
| 131 | ReGraM: Region-First Knowledge Graph Reasoning for Medical Question Answering Highlight: We argue that the core challenge lies not in expanding access to knowledge, but in identifying and reasoning over the appropriate subset of evidence for each query. |
Chaerin Lee; Sohee Park; Hyunsik Na; Daseon Choi; | arxiv-cs.CL | 2026-01-14 |
| 132 | EHRNavigator: A Multi-Agent System for Patient-Level Clinical Question Answering Over Heterogeneous Electronic Health Records Highlight: Clinical decision-making increasingly relies on timely and context-aware access to patient information within Electronic Health Records (EHRs), yet most existing natural language question-answering (QA) systems are evaluated solely on benchmark datasets, limiting their practical relevance. To overcome this limitation, we introduce EHRNavigator, a multi-agent framework that harnesses AI agents to perform patient-level question answering across heterogeneous and multimodal EHR data. |
LINGFEI QIAN et al. | arxiv-cs.CL | 2026-01-14 |
| 133 | STAGE: A Benchmark for Knowledge Graph Construction, Question Answering, and In-Script Role-Playing Over Movie Screenplays Highlight: We introduce STAGE (Screenplay Text, Agents, Graphs and Evaluation), a unified benchmark for narrative understanding over full-length movie screenplays. |
QIUYU TIAN et al. | arxiv-cs.CL | 2026-01-13 |
| 134 | Beyond Structured Knowledge: Performance Boundaries of ChatGPT in Geological-hazard Question Answering and The Need for Human-in-the-loop Oversight Highlight: We conduct a version-specific evaluation of ChatGPT-4o for geological-hazard question answering using a transparent, rubric-based design. |
SAIER WU et al. | Frontiers in Earth Science | 2026-01-13 |
| 135 | Enhancing Large Language Models for Knowledge Graph Question Answering Via Multi-granularity Knowledge Injection and Structured Reasoning Path-augmented Prompting |
Chuanyang Gong; Zhihua Wei; Wenhao Tao; Duoqian Miao; | Information Processing & Management | 2026-01-13 |
| 136 | Exploring The Meta-level Reasoning of Large Language Models Via A Tool-based Multi-hop Tabular Question Answering Task Highlight: We take a more structured approach, distinguishing meta-level reasoning (denoting the process of reasoning about intermediate steps required to solve a task) from object-level reasoning (which concerns the low-level execution of the aforementioned steps.) |
Nick Ferguson; Alan Bundy; Kwabena Nuamah; | arxiv-cs.CL | 2026-01-12 |
| 137 | Judging Against The Reference: Uncovering Knowledge-Driven Failures in LLM-Judges on QA Evaluation Highlight: We identify a critical failure mode of such reference-based LLM QA evaluation: when the provided reference conflicts with the judge model’s parametric knowledge, the resulting scores become unreliable, substantially degrading evaluation fidelity. To study this phenomenon systematically, we introduce a controlled swapped-reference QA framework that induces reference-belief conflicts. |
DONGRYEOL LEE et al. | arxiv-cs.CL | 2026-01-12 |
| 138 | Fine-Tuning Vs. RAG for Multi-Hop Question Answering with Novel Knowledge Highlight: In this paper, we systematically compare parametric and non-parametric knowledge injection methods for open-domain multi-hop question answering. |
Zhuoyi Yang; Yurun Song; Iftekhar Ahmed; Ian Harris; | arxiv-cs.CL | 2026-01-11 |
| 139 | UETQuintet at BioCreative IX – MedHopQA: Enhancing Biomedical QA with Selective Multi-hop Reasoning and Contextual Retrieval Highlight: In this paper, we propose a model designed to effectively address both direct and sequential questions. |
Quoc-An Nguyen; Thi-Minh-Thu Vu; Bich-Dat Nguyen; Dinh-Quang-Minh Tran; Hoang-Quynh Le; | arxiv-cs.CL | 2026-01-11 |
| 140 | FinCARDS: Card-Based Analyst Reranking for Financial Document Question Answering Highlight: We propose FinCards, a structured reranking framework that reframes financial evidence selection as constraint satisfaction under a finance-aware schema. |
YIXI ZHOU et al. | arxiv-cs.IR | 2026-01-11 |
| 141 | Efficient Visual Question Answering Pipeline for Autonomous Driving Via Scene Region Compression Highlight: This focus limits their practical deployment in real-time autonomous driving scenarios. To tackle this issue, we propose an efficient VLM framework for autonomous driving VQA tasks, SRC-Pipeline. |
Yuliang Cai; Dongqiangzi Ye; Zitian Chen; Chongruo Wu; | arxiv-cs.CV | 2026-01-11 |
| 142 | N2N-GQA: Noise-to-Narrative for Graph-Based Table-Text Question Answering Using LLMs Highlight: Our key insight is that multi-hop reasoning requires understanding relationships between evidence pieces: by modeling documents as graph nodes with semantic relationships as edges, we identify bridge documents connecting reasoning steps, a capability absent in list-based retrieval. |
Mohamed Sharafath; Aravindh Annamalai; Ganesh Murugan; Aravindakumar Venugopalan; | arxiv-cs.CL | 2026-01-10 |
| 143 | Do Language Models Reason Across Languages? Highlight: In this paper, we introduce a simple two-hop question answering setting, where answering a question requires making inferences over two multilingual documents. |
Yan Meng; Wafaa Mohammed; Christof Monz; | arxiv-cs.CL | 2026-01-10 |
| 144 | Extracting Structured Data from Unstructured Breast Imaging Reports with Transformer-based Models Highlight: This study compared the performance of BERT-based and generative language models in converting unstructured breast imaging reports into structured, tabular data suitable for clinical and research applications. |
Mikel Carrilero-Mardones; Jorge Pérez-Martín; Francisco Javier Díez; Iñigo Bermejo Delgado; | Frontiers in Digital Health | 2026-01-09 |
| 145 | A Lightweight and Explainable Vision-Language Framework for Crop Disease Visual Question Answering Highlight: This work presents a lightweight vision-language framework for crop and disease identification from leaf images. |
Md. Zahid Hossain; Most. Sharmin Sultana Samu; Md. Rakibul Islam; Md. Siam Ansary; | arxiv-cs.CV | 2026-01-08 |
| 146 | From National Curricula to Cultural Awareness: Constructing Open-Ended Culture-Specific Question Answering Dataset Highlight: We introduce CuCu, an automated multi-agent LLM framework that transforms national textbook curricula into open-ended, culture-specific question-answer pairs. |
Haneul Yoo; Won Ik Cho; Geunhye Kim; Jiyoon Han; | arxiv-cs.CL | 2026-01-08 |
| 147 | DisastQA: A Comprehensive Benchmark for Evaluating Question Answering in Disaster Management Highlight: We introduce DisastQA, a large-scale benchmark of 3,000 rigorously verified questions (2,000 multiple-choice and 1,000 open-ended) spanning eight disaster types. |
ZHITONG CHEN et al. | arxiv-cs.CL | 2026-01-07 |
| 148 | Self-MedRAG: A Self-Reflective Hybrid Retrieval-Augmented Generation Framework for Reliable Medical Question Answering Highlight: While Retrieval-Augmented Generation (RAG) mitigates these issues by incorporating external knowledge, conventional single-shot retrieval often fails to resolve complex biomedical queries requiring multi-step inference. To address this, we propose Self-MedRAG, a self-reflective hybrid framework designed to mimic the iterative hypothesis-verification process of clinical reasoning. |
Jessica Ryan; Alexander I. Gumilang; Robert Wiliam; Derwin Suhartono; | arxiv-cs.IR | 2026-01-07 |
| 149 | From Chains to Graphs: Self-Structured Reasoning for General-Domain LLMs Highlight: We propose Self-Graph Reasoning (SGR), a framework that enables LLMs to explicitly represent their reasoning process as a structured graph before producing the final answer. |
YINGJIAN CHEN et al. | arxiv-cs.CL | 2026-01-07 |
| 150 | When Models Decide and When They Bind: A Two-Stage Computation for Multiple-Choice Question-Answering Highlight: Multiple-choice question answering (MCQA) is easy to evaluate but adds a meta-task: models must both solve the problem and output the symbol that *represents* the answer, conflating reasoning errors with symbol-binding failures. |
Hugh Mee Wong; Rick Nouwen; Albert Gatt; | arxiv-cs.CL | 2026-01-07 |
| 151 | EpiQAL: Benchmarking Large Language Models in Epidemiological Question Answering for Enhanced Alignment and Reasoning Highlight: We present EpiQAL, the first diagnostic benchmark for epidemiological question answering across diverse diseases, comprising three subsets built from open-access literature. |
MINGYANG WEI et al. | arxiv-cs.CL | 2026-01-06 |
| 152 | SentGraph: Hierarchical Sentence Graph for Multi-hop Retrieval-Augmented Question Answering Highlight: Existing chunk-based retrieval often provides irrelevant and logically incoherent context, leading to incomplete evidence chains and incorrect reasoning during answer generation. To address these challenges, we propose SentGraph, a sentence-level graph-based RAG framework that explicitly models fine-grained logical relationships between sentences for multi-hop question answering. |
JUNLI LIANG et al. | arxiv-cs.CL | 2026-01-06 |
| 153 | LittiChoQA: Literary Texts in Indic Languages Chosen for Question Answering Highlight: We address the scarcity of long-context QA resources for Indic languages by introducing LittiChoQA, the largest literary QA dataset to date covering many languages spoken in the Gangetic plains of India. |
Aarya Khandelwal; Ritwik Mishra; Rajiv Ratn Shah; | arxiv-cs.CL | 2026-01-06 |
| 154 | DeCode: Decoupling Content and Delivery for Medical QA Highlight: In this work, we introduce DeCode, a training-free, model-agnostic framework that adapts existing LLMs to produce contextualized answers in clinical settings. |
Po-Jen Ko; Chen-Han Tsai; Yu-Shao Peng; | arxiv-cs.CL | 2026-01-05 |
| 155 | Crop GraphRAG: Pest and Disease Knowledge Base Q&A System for Sustainable Crop Protection Highlight: By mitigating the limitations of large language models in specialized agricultural contexts, this study provides a pragmatic tool for intelligent QA in the agricultural domain and advances the application of AI in crop protection. |
HAO WU et al. | Frontiers in Plant Science | 2026-01-05 |
| 156 | MacVQA: Adaptive Memory Allocation and Global Noise Filtering for Continual Visual Question Answering Highlight: However, current methods often struggle with balancing knowledge retention, adaptation, and robust feature representation. To address these challenges, we propose a novel framework with adaptive memory allocation and global noise filtering called MacVQA for visual question answering. |
ZHIFEI LI et al. | arxiv-cs.CV | 2026-01-05 |
| 157 | MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics Highlight: Building upon our prior MDAgent, we present MDAgent2, the first end-to-end framework capable of performing both knowledge Q&A and code generation within the MD domain. |
ZHUOFAN SHI et al. | arxiv-cs.CE | 2026-01-05 |
| 158 | When Do Tools and Planning Help LLMs Think? A Cost- and Latency-Aware Benchmark Highlight: Modern large language models (LLMs) increasingly rely on inference-time planning and external tools to improve reasoning. We benchmark this behavior on two real-world settings: event-centric question answering over graph-structured knowledge (Event-QA) and persuasive response generation in Reddit ChangeMyView (CMV). |
Subha Ghoshal; Ali Al-Bustami; | arxiv-cs.CL | 2026-01-05 |
| 159 | Adversarial Question Answering Robustness: A Multi-Level Error Analysis and Mitigation Study Highlight: We perform comprehensive multi-level error analysis using five complementary categorization schemes, identifying negation confusion and entity substitution as the primary failure modes. |
Agniv Roy Choudhury; Vignesh Ponselvan Rajasingh; | arxiv-cs.CL | 2026-01-05 |
| 160 | Question Answering for Multi-Release Systems: A Case Study at Ciena Highlight: Motivated by the observed inaccuracy of state-of-the-art question-answering techniques on multi-release system documents, we propose QAMR, a chatbot designed to answer questions across multi-release system documentation. |
PARHAM KHAMSEPOUR et al. | arxiv-cs.SE | 2026-01-05 |
| 161 | Augmenting Medical Visual Question Answering with Mixup, Label Smoothing, and Layer-wise Relevance Propagation EXplainable Artificial Intelligence Highlight: However, an imbalance in the number and distribution of image and Question–Answer (QA) pairs poses challenges for developing robust models. This study proposes improving existing MVQA datasets using data augmentation techniques, specifically Mixup and Label Smoothing, to address this issue. |
Sheerin Sitara Noor Mohamed; Kavitha Srinivasan; | PeerJ Computer Science | 2026-01-05 |
| 162 | PdfQA: Diverse, Challenging, and Realistic Question Answering Over PDFs Highlight: In this paper, we present pdfQA, a multi-domain 2K human-annotated (real-pdfQA) and 2K synthetic dataset (syn-pdfQA) differentiating QA pairs in ten complexity dimensions (e.g., file type, source modality, source position, answer type). |
TOBIAS SCHIMANSKI et al. | arxiv-cs.CL | 2026-01-05 |
| 163 | CTIS-QA: Clinical Template-Informed Slide-level Question Answering for Pathology Highlight: In this paper, we introduce a clinical diagnosis template-based pipeline to systematically collect and structure pathological information. |
HAO LU et al. | arxiv-cs.CV | 2026-01-04 |
| 164 | Reinforcement Learning Enhanced Multi-hop Reasoning for Temporal Knowledge Question Answering Highlight: However, at each hop, large language models (LLMs) retrieve subgraphs with numerous temporally similar and semantically complex relations, increasing the risk of suboptimal decisions and error propagation. To address these challenges, we propose the multi-hop reasoning enhanced (MRE) framework, which enhances both forward and backward reasoning to improve the identification of globally optimal reasoning trajectories. |
Wuzhenghong Wen; Chao Xue; Su Pan; Yuwei Sun; Minlong Peng; | arxiv-cs.AI | 2026-01-03 |
| 165 | Semantic Event Graphs for Long-Form Video Question Answering Highlight: We propose Semantic Event Graphs (SEG), a lightweight symbolic interface between video and language that replaces raw frames with compact temporal interaction logs. |
Aradhya Dixit; Tianxi Liang; | arxiv-cs.CV | 2026-01-01 |
| 166 | Retrieval–Reasoning Processes for Multi-hop Question Answering: A Four-Axis Design Framework and Empirical Trends Highlight: This survey takes the execution procedure as the unit of analysis and introduces a four-axis framework covering (A) overall execution plan, (B) index structure, (C) next-step control (strategies and triggers), and (D) stop/continue criteria. |
Yuelyu Ji; Zhuochun Li; Rui Meng; Daqing He; | arxiv-cs.CL | 2026-01-01 |
| 167 | Enhancing The QA Model Through A Multi-domain Debiasing Framework Highlight: By identifying errors related to lexical bias, numerical reasoning, and entity recognition, we develop a multi-domain debiasing framework incorporating knowledge distillation, debiasing techniques, and domain expansion. |
Yuefeng Wang; ChangJae Lee; | arxiv-cs.CL | 2026-01-01 |
| 168 | Explicit Abstention Knobs for Predictable Reliability in Video Question Answering Highlight: We investigate whether confidence-based abstention provides reliable control over error rates in video question answering, and whether that control remains robust under distribution shift. |
Jorge Ortiz; | arxiv-cs.AI | 2025-12-31 |
| 169 | Intelligent Diabetes Question Answering System Based on Large Language Models |
步翔 徐; | Artificial Intelligence and Robotics Research | 2025-12-31 |
| 170 | Comparative Analysis of The Risk of Hadith Errors in Question-Answering Systems Based on Large Language Models Highlight: This study aims to conduct a comparative analysis of the risk of errors generated by Large Language Model-based Question-Answering systems in answering hadith-related questions. |
Hakkun Elmunsyah; | International Journal of Research and Scientific Innovation | 2025-12-30 |
| 171 | DermaVQA-DAS: Dermatology Assessment Schema (DAS) & Datasets for Closed-Ended Question Answering & Segmentation in Patient-Generated Dermatology Images Highlight: Recent advances in dermatological image analysis have been driven by large-scale annotated datasets; however, most existing benchmarks focus on dermatoscopic images and lack patient-authored queries and clinical context, limiting their applicability to patient-centered care. To address this gap, we introduce DermaVQA-DAS, an extension of the DermaVQA dataset that supports two complementary tasks: closed-ended question answering (QA) and dermatological lesion segmentation. |
WEN-WAI YIM et al. | arxiv-cs.CV | 2025-12-30 |
| 172 | Improving Few-Shot Change Detection Visual Question Answering Via Decision-Ambiguity-guided Reinforcement Fine-Tuning Highlight: To this end, we propose DARFT, a Decision-Ambiguity-guided Reinforcement Fine-Tuning framework that first mines DAS using an SFT-trained reference policy and then applies group-relative policy optimization on the mined subset. |
FUYU DONG et al. | arxiv-cs.CV | 2025-12-30 |
| 173 | HaluNet: Multi-Granular Uncertainty Modeling for Efficient Hallucination Detection in LLM Question Answering Highlight: We present HaluNet, a lightweight and trainable neural framework that integrates multi-granular token-level uncertainties by combining semantic embeddings with probabilistic confidence and distributional uncertainty. |
CHAODONG TONG et al. | arxiv-cs.CL | 2025-12-30 |
| 174 | Integrating Domain Knowledge for Financial QA: A Multi-Retriever RAG Approach with LLMs Highlight: Through comprehensive ablation experiments and error analysis, we find that domain-specific training with the SecBERT encoder significantly contributes to our best neural symbolic model surpassing the FinQA paper’s top model, which serves as our baseline. |
Yukun Zhang; Stefan Elbl Droguett; Samyak Jain; | arxiv-cs.CL | 2025-12-29 |
| 175 | Retrieval Augmented Question Answering: When Should LLMs Admit Ignorance? Highlight: We investigate the use of LLMs for retrieval augmented question answering. |
Dingmin Wang; Ji Ma; Shankar Kumar; | arxiv-cs.CL | 2025-12-29 |
| 176 | EdgeJury: Cross-Reviewed Small-Model Ensembles for Truthful Question Answering on Serverless Edge Inference Highlight: We present EdgeJury, a lightweight ensemble framework that improves truthfulness and robustness using only small instruction-tuned language models (3B-8B) suitable for serverless edge inference. |
Aayush Kumar; | arxiv-cs.LG | 2025-12-29 |
| 177 | Chain-of-thought Reviewing and Correction for Time Series Question Answering Highlight: Different from purely textual tasks, time series data are inherently verifiable, enabling consistency checking between reasoning steps and the original input. Motivated by this property, we propose T3LLM, which performs multi-step reasoning with an explicit correction mechanism for time series question answering. |
Chen Su; Yuanhe Tian; Yan Song; | arxiv-cs.CL | 2025-12-27 |
| 178 | Security Architecture of Smart Grid Knowledge Question-answering Systems Based on Large Language Models |
Yun Fu; Junrong Liu; Qiucheng Ban; Linyan Zhou; | Cyber-Physical Systems | 2025-12-26 |
| 179 | Uncertainty-Aware Dynamic Knowledge Graphs for Reliable Question Answering Highlight: We present a demonstration of uncertainty-aware dynamic KGs, a framework that combines (i) dynamic construction of evolving KGs, (ii) confidence scoring and uncertainty-aware retrieval, and (iii) an interactive interface for reliable and interpretable QA. |
YU TAKAHASHI et al. | arxiv-cs.CL | 2025-12-26 |
| 180 | KG20C & KG20C-QA: Scholarly Knowledge Graph Benchmarks for Link Prediction and Question Answering Highlight: In this paper, we present KG20C and KG20C-QA, two curated datasets for advancing question answering (QA) research on scholarly data. |
Hung-Nghiep Tran; Atsuhiro Takasu; | arxiv-cs.IR | 2025-12-25 |
| 181 | Streaming Video Instruction Tuning Highlight: We present Streamo, a real-time streaming video LLM that serves as a general-purpose interactive assistant. |
Jiaer Xia; Peixian Chen; Mengdan Zhang; Xing Sun; Kaiyang Zhou; | arxiv-cs.CV | 2025-12-24 |
| 182 | MAR: Multi-Agent Reflexion Improves Reasoning Abilities in LLMs Highlight: However, continual reflections of the same LLM onto itself exhibit degeneration of thought, where the LLM continues to repeat the same errors again and again even with the knowledge that it's wrong. To address this problem, we instead introduce multi-agent with multi-persona debaters as the method to generate reflections. |
ONAT OZER et al. | arxiv-cs.AI | 2025-12-23 |
| 183 | CycleChart: A Unified Consistency-Based Learning Framework for Bidirectional Chart Understanding and Generation Highlight: We introduce CycleChart, a consistency-based learning framework for bidirectional chart understanding and generation. |
Dazhen Deng; Sen Yang; Yuchen He; Yuan Tian; Yingcai Wu; | arxiv-cs.CL | 2025-12-22 |
| 184 | Toward Ethical AI Through Bayesian Uncertainty in Neural Question Answering Highlight: We explore Bayesian reasoning as a means to quantify uncertainty in neural networks for question answering. |
Riccardo Di Sipio; | arxiv-cs.CL | 2025-12-19 |
| 185 | Video Detective: Seek Critical Clues Recurrently to Answer Question from Long Videos Highlight: In fact, when answering given questions, only a small amount of crucial information is required. Therefore, we propose an efficient question-aware memory mechanism, enabling MLLMs to recurrently seek these critical clues. |
Henghui Du; Chang Zhou; Chunjie Zhang; Xi Chen; Di Hu; | arxiv-cs.CV | 2025-12-18 |
| 186 | Evaluating The Capability of Video Question Generation for Expert Knowledge Elicitation Highlight: For a continuous improvement of VQG models, we propose a protocol that evaluates the ability by simulating question-answering communication with experts using a question-to-answer retrieval. |
Huaying Zhang; Atsushi Hashimoto; Tosho Hirasawa; | arxiv-cs.CV | 2025-12-16 |
| 187 | Improving VQA Reliability: A Dual-Assessment Approach with Self-Reflection and Cross-Model Verification Highlight: However, the susceptibility of VLMs to hallucinations can lead to overconfident yet incorrect answers, severely undermining answer reliability. To address this, we propose Dual-Assessment for VLM Reliability (DAVR), a novel framework that integrates Self-Reflection and Cross-Model Verification for comprehensive uncertainty estimation. |
XIXIAN WU et al. | arxiv-cs.CV | 2025-12-16 |
| 188 | Code-Driven LLM Agent for One-Shot Explanatory Visual Question Answering Highlight: In this paper, we propose OneCoLA (One-shot and training-free Code-driven LLM Agent), a novel framework for Multimodal Explanatory Visual Question Answering (MEVQA). |
Zuyi Zhou; Dizhan Xue; Baoyuan Qi; Shengsheng Qian; Changsheng Xu; | ACM Transactions on Multimedia Computing, Communications, … | 2025-12-16 |
| 189 | Parameter Efficient Multimodal Instruction Tuning for Romanian Vision Language Models Highlight: In this work, we contribute to reducing the multimodal NLP resource gap for Romanian. |
George-Andrei Dima; Dumitru-Clementin Cercel; | arxiv-cs.CL | 2025-12-16 |
| 190 | An Open and Reproducible Deep Research Agent for Long-Form Question Answering Highlight: We present an open deep research system for long-form question answering, selected as a winning system in the text-to-text track of the MMU-RAG competition at NeurIPS 2025. |
IKUYA YAMADA et al. | arxiv-cs.CL | 2025-12-15 |
| 191 | Toward Ambulatory Vision: Learning Visually-Grounded Active View Selection Highlight: We introduce Visually Grounded Active View Selection (VG-AVS), a task that selects the most informative next viewpoint using only the visual information in the current image, without relying on scene memory or external knowledge. |
Juil Koo; Daehyeon Choi; Sangwoo Youn; Phillip Y. Lee; Minhyuk Sung; | arxiv-cs.CV | 2025-12-15 |
| 192 | KFS-Bench: Comprehensive Evaluation of Key Frame Sampling in Long Video Understanding Highlight: We propose KFS-Bench, the first benchmark for key frame sampling in long video question answering (QA), featuring multi-scene annotations to enable direct and robust evaluation of sampling strategies. |
Zongyao Li; Kengo Ishida; Satoshi Yamazaki; Xiaotong Ji; Jianquan Liu; | arxiv-cs.CV | 2025-12-15 |
| 193 | Hybrid Retrieval-Augmented Generation for Robust Multilingual Document Question Answering Highlight: We develop and evaluate a multilingual Retrieval-Augmented Generation pipeline specifically designed for question answering on noisy historical documents. |
Anthony Mudet; Souhail Bakkali; | arxiv-cs.DL | 2025-12-14 |
| 194 | Time-Aware Complex Question Answering Over Temporal Knowledge Graph |
Luyi Bai; Tongyue Zhang; Guangchen Feng; | Data Knowl. Eng. | |
| 195 | ViInfographicVQA: A Benchmark for Single and Multi-image Visual Question Answering on Vietnamese Infographics Highlight: We introduce ViInfographicVQA, the first benchmark for Vietnamese InfographicVQA, comprising over 6747 real-world infographics and 20409 human-verified question-answer pairs across economics, healthcare, education, and more. |
TUE-THU VAN-DINH et. al. | arxiv-cs.CV | 2025-12-13 |
| 196 | Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking By Contrasting Layers Highlight: However, existing RAG methods for simple and multi-hop question answering (QA) are still prone to incorrect retrievals and hallucinations. To address these limitations, we propose CoopRAG, a novel RAG framework for the question answering task in which a retriever and an LLM work cooperatively with each other by exchanging informative knowledge, and the earlier and later layers of the retriever model work cooperatively with each other to accurately rank the retrieved documents relevant to a given query. |
Youmin Ko; Sungjong Seo; Hyunjoon Kim; | arxiv-cs.CL | 2025-12-11 |
| 197 | MedBioRAG: Semantic Search and Retrieval-Augmented Generation with Large Language Models for Medical and Biological QA Highlight: In this paper, we introduce MedBioRAG, a retrieval-augmented model designed to improve biomedical QA performance through a combination of semantic and lexical search, document retrieval, and supervised fine-tuning. |
Seonok Kim; | arxiv-cs.CL | 2025-12-10 |
| 198 | DIANGPT: A Review on A Domain-Specific Question Answering System Using LORA and RAG Highlight: This review evaluates DianGPT’s architectural pipeline, including dataset processing, supervised fine-tuning, retrieval workflows, and automated evaluation strategies. |
Pallavi SK; Pradeep Nayak; Nischitha Nischitha; Omkar KS; Omkar JK; | International Journal of Scientific Research in Engineering … | 2025-12-10 |
| 199 | SimpleDevQA: Benchmarking Large Language Models on Development Knowledge QA Highlight: Through this pipeline, we introduce SimpleDevQA, a multilingual benchmark derived from real user dialogues. |
JING ZHANG et. al. | arxiv-cs.SE | 2025-12-09 |
| 200 | ClinicalTrialsHub: Bridging Registries and Literature for Comprehensive Clinical Trial Access Highlight: We present ClinicalTrialsHub, an interactive search-focused platform that consolidates all data from ClinicalTrials.gov and augments it by automatically extracting and structuring trial-relevant information from PubMed research articles. |
JIWOO PARK et. al. | arxiv-cs.CL | 2025-12-08 |
| 201 | Instant Question Answering System for Pdf Documents Using Local Retrieval-Augmented Generation Highlight: This paper proposes PDF-Insight 360, a fully local, open-source Retrieval-Augmented Generation (RAG) framework that enables instant, accurate and privacy-preserving natural-language question answering on arbitrary PDF documents using only consumer-grade hardware. |
P. B. Khandekar; | International Journal of Scientific Research in Engineering … | 2025-12-08 |
| 202 | BoundingDocs: A Unified Dataset for Document Question Answering with Spatial Annotations Highlight: We present a unified dataset for document Question-Answering (QA), which is obtained combining several public datasets related to Document AI and visually rich document understanding (VRDU). |
Simone Giovannini; Fabio Coppini; Andrea Gemelli; Simone Marinai; | International Journal on Document Analysis and Recognition … | 2025-12-06 |
| 203 | Knowing What’s Missing: Assessing Information Sufficiency in Question Answering Highlight: To this end, we propose a structured Identify-then-Verify framework for robust sufficiency modeling. |
Akriti Jain; Aparna Garimella; | arxiv-cs.CL | 2025-12-06 |
| 204 | Modeling Contextual Passage Utility for Multihop Question Answering Highlight: In this paper, we propose a lightweight approach to model contextual passage utility, accounting for inter-passage dependencies. |
Akriti Jain; Aparna Garimella; | arxiv-cs.CL | 2025-12-06 |
| 205 | Optimizing Medical Question-Answering Systems: A Comparative Study of Fine-Tuned and Zero-Shot Large Language Models with RAG Framework Highlight: In this paper, we present a retrieval-augmented generation (RAG) based medical QA system that combines domain-specific knowledge retrieval with open-source LLMs to answer medical questions. |
TASNIMUL HASSAN et. al. | arxiv-cs.CL | 2025-12-05 |
| 206 | Collective Narrative Grounding: Community-Coordinated Data Contributions to Improve Local AI Systems Highlight: Large language model (LLM) question-answering systems often fail on community-specific queries, creating knowledge blind spots that marginalize local voices and reinforce epistemic injustice. We present Collective Narrative Grounding, a participatory protocol that transforms community stories into structured narrative units and integrates them into AI systems under community governance. |
Zihan Gao; Mohsin Y. K. Yousufi; Jacob Thebault-Spieker; | arxiv-cs.CL | 2025-12-05 |
| 207 | ArtistMus: A Globally Diverse, Artist-Centric Benchmark for Retrieval-Augmented Music Question Answering Highlight: We introduce MusWikiDB, a vector database of 3.2M passages from 144K music-related Wikipedia pages, and ArtistMus, a benchmark of 1,000 questions on 500 diverse artists with metadata such as genre, debut year, and topic. |
Daeyong Kwon; SeungHeon Doh; Juhan Nam; | arxiv-cs.CL | 2025-12-05 |
| 208 | Grounded Multilingual Medical Reasoning for Question Answering with Large Language Models Highlight: In this work, we present a method to generate multilingual reasoning traces grounded in factual medical knowledge. |
Pietro Ferrazzi; Aitor Soroa; Rodrigo Agerri; | arxiv-cs.CL | 2025-12-05 |
| 209 | PathFinder: MCTS and LLM Feedback-based Path Selection for Multi-Hop Question Answering Highlight: Hence, we propose PATHFINDER, an approach that: (i) uses Monte Carlo Tree Search to generate training path traces, (ii) improves training data quality by filtering erroneous and lengthy traces using sub-answer recall and LLM-as-a-judge verification, and (iii) reformulates sub-queries to handle failed retrieval cases. |
Durga Prasad Maram; Kalpa Gunaratna; Vijay Srinivasan; Haris Jeelani; Srinivas Chappidi; | arxiv-cs.LG | 2025-12-04 |
| 210 | Fine-Tuning BERT for Domain-Specific Question Answering: Toward Educational NLP Resources at University Scale Highlight: In this study, we developed a chatbot for the University of Limerick’s Department of Electronic and Computer Engineering to provide course information to students. |
Aurélie Montfrond; | arxiv-cs.CL | 2025-12-04 |
| 211 | Quantum Physics Intelligent Question Answering (Q&A) System Based on Retrieval‐Augmented Generation Highlight: Overall, this work contributes both a robust technical solution and a much‐needed benchmark to advance research in AI‐powered education and intelligent question answering within the field of quantum physics. |
Wenchen Li; Su Lu; Hongqi Zhu; Peijun Wu; Wuhe Zou; | Concurrency and Computation: Practice and Experience | 2025-12-03 |
| 212 | CryptoQA: A Large-scale Question-answering Dataset for AI-assisted Cryptography Highlight: However, their ability to perform deep reasoning and mathematical analysis, particularly for complex tasks as required in cryptography, remains poorly understood, largely due to the lack of suitable data for evaluation and training. To address this gap, we present CryptoQA, the first large-scale question-answering (QA) dataset specifically designed for cryptography. |
MAYAR ELFARES et. al. | arxiv-cs.CR | 2025-12-02 |
| 213 | BookRAG: A Hierarchical Structure-aware Index-based Approach for Retrieval-Augmented Generation on Complex Documents Highlight: Existing RAG approaches often focus on general documents, and they overlook the fact that many real-world documents (such as books, booklets, handbooks, etc.) have a hierarchical structure, which organizes their content from different granularity levels, leading to poor performance for the QA task. To address these limitations, we introduce BookRAG, a novel RAG approach targeted for documents with a hierarchical structure, which exploits logical hierarchies and traces entity relations to query the highly relevant information. |
Shu Wang; Yingli Zhou; Yixiang Fang; | arxiv-cs.IR | 2025-12-02 |
| 214 | Retrieving on A Topic Graph for Long Document Question Answering |
SHIWEI CHEN et. al. | Neurocomputing | 2025-12-01 |
| 215 | CAIRNS: Balancing Readability and Scientific Accuracy in Climate Adaptation Question Answering Highlight: We present Climate Adaptation question-answering with Improved Readability and Noted Sources (CAIRNS), a framework that enables experts — farmer advisors — to obtain credible preliminary answers from complex evidence sources from the web. |
Liangji Kong; Aditya Joshi; Sarvnaz Karimi; | arxiv-cs.CL | 2025-12-01 |
| 216 | Memory-Augmented Knowledge Fusion with Safety-Aware Decoding for Domain-Adaptive Question Answering Highlight: In this work, we introduce Knowledge-Aware Reasoning and Memory-Augmented Adaptation (KARMA), a novel framework designed to enhance QA performance in care scenarios. |
Lei Fu; Xiang Chen; Kaige Gao; Xinyue Huang; Kejian Tong; | arxiv-cs.CL | 2025-12-01 |
| 217 | HanDyVQA: A Video QA Benchmark for Fine-Grained Hand-Object Interaction Dynamics Highlight: We introduce HanDyVQA, a fine-grained video question-answering benchmark that comprehensively covers both the manipulation and effect aspects of HOI. |
Masatoshi Tateno; Gido Kato; Hirokatsu Kataoka; Yoichi Sato; Takuma Yagi; | arxiv-cs.CV | 2025-11-30 |
| 218 | Machine Learning and Deep Learning Techniques in Arabic Question Answering Systems: Innovations and Challenges Highlight: Moreover, we recognize that the impact of AQAS extends beyond academia; it has significant implications for various sectors, including education, technology, and information access. Through this comprehensive examination, we aim to lay the groundwork for ongoing innovation and development in AQAS. |
Azza Mohamed; Khaled Abdelqader; Khaled Shaalan; | PeerJ Computer Science | 2025-11-28 |
| 219 | Tourism Question Answer System in Indian Language Using Domain-Adapted Foundation Models Highlight: In this paper, a dataset comprising 7,715 Hindi QA pairs pertaining to Varanasi tourism was constructed and subsequently augmented with 27,455 pairs generated via Llama zero-shot prompting. |
Praveen Gatla; Nikita Kanwar; Gouri Sahoo; Rajesh Kumar Mundotiya; | arxiv-cs.CL | 2025-11-28 |
| 220 | Multi-Modal Scene Graph with Kolmogorov-Arnold Experts for Audio-Visual Question Answering Highlight: In this paper, we propose a novel Multi-Modal Scene Graph with Kolmogorov-Arnold Expert Network for Audio-Visual Question Answering (SHRIKE). |
Zijian Fu; Changsheng Lv; Mengshi Qi; Huadong Ma; | arxiv-cs.AI | 2025-11-28 |
| 221 | ORCA: Open-ended Response Correctness Assessment for Audio Question Answering Highlight: We present ORCA (Open-ended Response Correctness Assessment), a framework that models the variability in human judgments using Beta distributions to predict both expected correctness and uncertainty. |
ŠIMON SEDLÁČEK et. al. | arxiv-cs.SD | 2025-11-28 |
| 222 | WearVQA: A Visual Question Answering Benchmark for Wearables in Egocentric Authentic Real-world Scenarios Highlight: We introduce WearVQA, the first benchmark specifically designed to evaluate the Visual Question Answering (VQA) capabilities of multi-model AI assistant on wearable devices like smart glasses. |
EUN CHANG et. al. | arxiv-cs.AI | 2025-11-27 |
| 223 | JBE-QA: Japanese Bar Exam QA Dataset for Assessing Legal Domain Knowledge Highlight: We introduce JBE-QA, a Japanese Bar Exam Question-Answering dataset to evaluate large language models’ legal knowledge. |
ZHIHAN CAO et. al. | arxiv-cs.CL | 2025-11-27 |
| 224 | KA-RAG: Integrating Knowledge Graphs and Agentic Retrieval-Augmented Generation for An Intelligent Educational Question-Answering Model Highlight: Generative artificial intelligence (AI) and large language models (LLMs) are reshaping the landscape of intelligent educational systems; however, existing solutions often suffer from unstructured resource organization, limited interpretability, and suboptimal retrieval precision. To address these challenges, this study introduces KA-RAG, a course-oriented question answering (QA) framework that integrates a structured Knowledge Graph (KG) with an Agentic Retrieval-Augmented Generation (Agentic-RAG) workflow. |
Fangqun Gao; Shu Xu; Weiyan Hao; Tao Lu; | Applied Sciences | 2025-11-26 |
| 225 | Progressive Knowledge Distillation and Numerical Reasoning Enhancement for Financial Report Question Answering Highlight: Our method introduces a difficulty-aware curriculum learning strategy that organizes training into two progressive stages, facilitating more effective and stable model learning. |
RUONAN FANG et. al. | Electronics | 2025-11-26 |
| 226 | Chatty-KG: A Multi-Agent AI System for On-Demand Conversational Question Answering Over Knowledge Graphs Highlight: Traditional KGQA systems preserve structure but typically support only single-turn QA, incur high latency, and struggle with coreference and context tracking. To address these limitations, we propose Chatty-KG, a modular multi-agent system for conversational QA over KGs. |
REHAM OMAR et. al. | arxiv-cs.CL | 2025-11-25 |
| 227 | ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction Highlight: We introduce ENACT, a benchmark that casts evaluation of embodied cognition as world modeling from egocentric interaction in a visual question answering (VQA) format. |
QINENG WANG et. al. | arxiv-cs.AI | 2025-11-25 |
| 228 | GHR-VQA: Graph-guided Hierarchical Relational Reasoning for Video Question Answering Highlight: We propose GHR-VQA, Graph-guided Hierarchical Relational Reasoning for Video Question Answering (Video QA), a novel human-centric framework that incorporates scene graphs to capture intricate human-object interactions within video sequences. |
Dionysia Danai Brilli; Dimitrios Mallis; Vassilis Pitsikalis; Petros Maragos; | arxiv-cs.CV | 2025-11-25 |
| 229 | SFA: Scan, Focus, and Amplify Toward Guidance-aware Answering for Video TextVQA Highlight: Moreover, the model must identify question-relevant textual cues and filter out redundant or irrelevant information to ensure answering is guided by the most relevant and informative cues. To address these challenges, we propose SFA, a training-free framework and the first Video-LLM-based method tailored for Video TextVQA, motivated by the human process of answering questions. |
HAIBIN HE et. al. | arxiv-cs.CV | 2025-11-25 |
| 230 | EAGER: Edge-Aligned LLM Defense for Robust, Efficient, and Accurate Cybersecurity Question Answering Highlight: We present EAGER, an edge-aligned defense framework that integrates parameter-efficient quantization with domain-specific preference alignment to jointly optimize efficiency, robustness, and accuracy. |
Onat Gungor; Roshan Sood; Jiasheng Zhou; Tajana Rosing; | arxiv-cs.CR | 2025-11-24 |
| 231 | VeriSciQA: An Auto-Verified Dataset for Scientific Visual Question Answering Highlight: Although recent work uses LVLMs to synthesize data at scale, we identify systematic errors in their resulting QA pairs, stemming from LVLMs’ inherent limitations and information asymmetry between figures and text. To address these challenges, we propose a verification-centric Generate-then-Verify framework that first generates QA pairs with figure-associated textual context, then applies cross-modal consistency checks against figures along with auxiliary filters to eliminate erroneous pairs. |
Yuyi Li; Daoyuan Chen; Zhen Wang; Yutong Lu; Yaliang Li; | arxiv-cs.CV | 2025-11-24 |
| 232 | IndEgo: A Dataset of Industrial Scenarios and Collaborative Work for Egocentric Assistants Highlight: We introduce IndEgo, a multimodal egocentric and exocentric dataset addressing common industrial tasks, including assembly/disassembly, logistics and organisation, inspection and repair, woodworking, and others. |
VIVEK CHAVAN et. al. | arxiv-cs.CV | 2025-11-24 |
| 233 | Are Large Vision Language Models Truly Grounded in Medical Images? Evidence from Italian Clinical Visual Question Answering Highlight: Large vision language models (VLMs) have achieved impressive performance on medical visual question answering benchmarks, yet their reliance on visual information remains unclear. |
FEDERICO FELIZZI et. al. | arxiv-cs.CV | 2025-11-24 |
| 234 | Thinking Ahead: Foresight Intelligence in MLLMs and World Models Highlight: In this work, we define Foresight Intelligence as the capability to anticipate and interpret future events, an ability essential for applications such as autonomous driving, yet largely overlooked by existing research. |
ZHANTAO GONG et. al. | arxiv-cs.CV | 2025-11-23 |
| 235 | ChineseVideoBench: Benchmarking Multi-modal Large Models for Chinese Video Question Answering Highlight: This paper introduces ChineseVideoBench, a pioneering benchmark specifically designed for evaluating Multimodal Large Language Models (MLLMs) in Chinese Video Question Answering. |
YUXIANG NIE et. al. | arxiv-cs.CV | 2025-11-23 |
| 236 | MedPerturbing LLMs: A Comparative Study of Toxicity, Prompt Tuning, and Jailbreaks in Medical QA Highlight: In this paper, we evaluate the toxicity of widely used general-purpose LLMs in medical question–answering tasks. |
Arash Asgari; Amirreza Naziri; Laleh Seyyed-Kalantari; | Proceedings of the AAAI Symposium Series | 2025-11-23 |
| 237 | Measuring The Impact of Lexical Training Data Coverage on Hallucination Detection in Large Language Models Highlight: We propose a complementary question: Does lexical training-data coverage of the question and/or generated answer provide additional signal for hallucination detection? |
Shuo Zhang; Fabrizio Gotti; Fengran Mo; Jian-Yun Nie; | arxiv-cs.CL | 2025-11-22 |
| 238 | RoadSceneVQA: Benchmarking Visual Question Answering in Roadside Perception Systems for Intelligent Transportation System Highlight: Based on the above, we propose the baseline model RoadMind. |
RUNWEI GUAN et. al. | arxiv-cs.CV | 2025-11-22 |
| 239 | MGA-VQA: Secure and Interpretable Graph-Augmented Visual Question Answering with Memory-Guided Protection Against Unauthorized Knowledge Use Highlight: We propose MGA-VQA, a multi-modal framework that integrates token-level encoding, spatial graph reasoning, memory-augmented inference, and question-guided compression. |
Ahmad Mohammadshirazi; Pinaki Prasad Guha Neogi; Dheeraj Kulshrestha; Rajiv Ramnath; | arxiv-cs.CV | 2025-11-21 |
| 240 | SMILE: A Composite Lexical-Semantic Metric for Question-Answering Evaluation Highlight: Large Language Model (LLM) based evaluators, though powerful, come with drawbacks like high costs, bias, inconsistency, and hallucinations. To address these issues, we introduce SMILE: Semantic Metric Integrating Lexical Exactness, a novel approach that combines sentence-level semantic understanding with keyword-level semantic understanding and easy keyword matching. |
SHRIKANT KENDRE et. al. | arxiv-cs.CL | 2025-11-21 |
| 241 | EduMod-LLM: A Modular Approach for Designing Flexible and Transparent Educational Assistants Highlight: In this work, we introduce EduMod-LLM, a modular function-calling LLM pipeline, and present a comprehensive evaluation along three key axes: function calling strategies, retrieval methods, and generative language models. |
Meenakshi Mittal; Rishi Khare; Mihran Miroyan; Chancharik Mitra; Narges Norouzi; | arxiv-cs.CL | 2025-11-21 |
| 242 | Question Answering Models for Information Extraction from Perovskite Materials Science Literature Highlight: In this study, we developed and tested a Question Answering (QA) approach to extract material-property relationships from scientific publications. |
Matilda Sipilä; Farrokh Mehryary; Sampo Pyysalo; Filip Ginter; Milica Todorović; | Communications Materials | 2025-11-20 |
| 243 | ESGBench: A Benchmark for Explainable ESG Question Answering in Corporate Sustainability Reports Highlight: We present ESGBench, a benchmark dataset and evaluation framework designed to assess explainable ESG question answering systems using corporate sustainability reports. |
Sherine George; Nithish Saji; | arxiv-cs.CL | 2025-11-20 |
| 244 | Detection and Mitigation of Factual Hallucinations in Large Language Models: A Comparative Review of The Timing and Effectiveness of External Retrieval, Post-hoc Verification, and Evaluation Methods Highlight: Regarding the former, this survey reviews IRCoT, SELF-RAG, ReAct, and Atlas, highlighting how combining retrieval with thought chaining can reduce multi-layer errors and improve answer grounding in knowledge-intensive question answering. |
Bingchen Zhou; | Applied and Computational Engineering | 2025-11-19 |
| 245 | Development of An Intelligent Information Retrieval System Based on Ontology, Linguistic Algorithms and Large Language Models Highlight: This study proposes a hybrid semantic question-answering (QA) system for the Kazakh language that integrates ontological modeling, linguistic processing, and large language models (LLMs). |
Assel Mukanova; Aizhan Nazyrova; Altanbek Zulkhazhav; Zhanar Lamasheva; Assem Dauletkaliyeva; | Applied Sciences | 2025-11-19 |
| 246 | AVATAAR: Agentic Video Answering Via Temporal Adaptive Alignment and Reasoning Highlight: Although large vision language models (LVLMs) have enhanced performance, they often face challenges with nuanced queries that demand both a comprehensive understanding and detailed analysis. To overcome these obstacles, we introduce AVATAAR, a modular and interpretable framework that combines global and local video context, along with a Pre Retrieval Thinking Agent and a Rethink Module. |
Urjitkumar Patel; Fang-Chun Yeh; Chinmay Gondhalekar; | arxiv-cs.CV | 2025-11-19 |
| 247 | PLMAS: Adaptive Sample Selection for Prompting LLMs in Knowledge-Based Visual Question Answering Highlight: However, existing sample selection strategies for in-context learning are oversimplified and fail to adequately leverage the tacit knowledge encoded within LLMs. To address this limitation, we propose an adaptive sample selection strategy that integrates triple similarity calculations (question-image, question-caption, and question-pre-answer) and dynamically assembles the most relevant samples using weighted combinations, thereby effectively activating the large model’s implicit knowledge. |
Jian Li; Quanxing Xu; Ling Zhou; Feifei Zhang; Rubing Huang; | ACM Transactions on Multimedia Computing, Communications, … | 2025-11-19 |
| 248 | Beyond GeneGPT: A Multi-Agent Architecture with Open-Source LLMs for Enhanced Genomic Question Answering Highlight: However, its reliance on a proprietary model limits scalability, increases operational costs, and raises concerns about data privacy and generalization. In this work, we revisit and reproduce GeneGPT in a pilot study using open source models, including Llama 3.1, Qwen2.5, and Qwen2.5 Coder, within a monolithic architecture; this allows us to identify the limitations of this approach. |
Haodong Chen; Guido Zuccon; Teerapong Leelanupab; | arxiv-cs.AI | 2025-11-18 |
| 249 | Multi Table QA: Evaluating Modern LLM Strategies on Table0 Understanding Highlight: We introduce tool-augmented reasoning as a paradigm for multi-table QA, and systematically study two complementary strategies: (1) free-form tool interaction, where models iteratively call exploration and computation tools, and (2) structured agent workflows, which stage tool-use into exploration, preparation, and analysis phases. |
Letian Li; | Computers and Artificial Intelligence | 2025-11-18 |
| 250 | SweeperBot: Making 3D Browsing Accessible Through View Analysis and Visual Question Answering Highlight: Grounded on a formative study, this paper introduces SweeperBot, a system that enables SR users to leverage visual question answering to explore and compare 3D models. |
Chen Chen; Cuong Nguyen; Alexa Siu; Dingzeyu Li; Nadir Weibel; | arxiv-cs.HC | 2025-11-18 |
| 251 | Foundational Question Generation for Video Question Answering Via An Embedding-Integrated Approach Highlight: In this paper, we introduce Foundational Question Generation for Video Question Answering via an Embedding-Integrated Approach (FIQ), a framework designed to enhance the reasoning capability of VQA models by improving their foundational comprehension of video content. |
Ju-Young Oh; | arxiv-cs.CV | 2025-11-18 |
| 252 | BBox DocVQA: A Large Scale Bounding Box Grounded Dataset for Enhancing Reasoning in Document Visual Question Answer Highlight: However, most existing DocVQA datasets are limited to the page level and lack fine grained spatial grounding, constraining the interpretability and reasoning capability of Vision Language Models (VLMs). To address this gap, we introduce BBox DocVQA, a large scale, bounding box grounded dataset designed to enhance spatial reasoning and evidence localization in visual documents. |
WENHAN YU et. al. | arxiv-cs.DB | 2025-11-18 |
| 253 | Audio Question Answering with GRPO-Based Fine-Tuning and Calibrated Segment-Level Predictions Highlight: In this report, we describe our submission to Track 5 of the DCASE 2025 Challenge for the task of Audio Question Answering (AQA). |
MARCEL GIBIER et. al. | arxiv-cs.SD | 2025-11-18 |
| 254 | Descriptor: Distance-Annotated Traffic Perception Question Answering (DTPQA) Highlight: In this article, we provide the dataset itself along with the Python scripts used to create it, which can be used to generate additional data of the same kind. |
NIKOS THEODORIDIS et. al. | arxiv-cs.CV | 2025-11-17 |
| 255 | Geospatial Chain of Thought Reasoning for Enhanced Visual Question Answering on Satellite Imagery Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a VQA framework that integrates CoT reasoning with Direct Preference Optimization (DPO) to improve interpretability, robustness, and accuracy. |
Shambhavi Shanker; Manikandan Padmanaban; Jagabondhu Hazra; | arxiv-cs.CV | 2025-11-14 |
| 256 | A Visual Question Answering Method Based on Task Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We conducted validation on four datasets: CLEVR, CLEVR-Human, CLEVR-CoGenT and GQA. |
Yao Cong; Hongwei Mo; | PLOS One | 2025-11-13 |
| 257 | SCARE: A Benchmark for SQL Correction and Question Answerability Classification for Reliable EHR Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While prior work has mainly focused on improving SQL generation accuracy or filtering questions before execution, there is a lack of a unified benchmark for evaluating independent post-hoc verification mechanisms (i.e., a component that inspects and validates the generated SQL before execution), which is crucial for safe deployment. To fill this gap, we introduce SCARE, a benchmark for evaluating methods that function as a post-hoc safety layer in EHR QA systems. |
Gyubok Lee; Woosog Chay; Edward Choi; | arxiv-cs.CL | 2025-11-13 |
| 258 | Local Hybrid Retrieval-Augmented Document QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Organizations handling sensitive documents face a critical dilemma: adopt cloud-based AI systems that offer powerful question-answering capabilities but compromise data privacy, or maintain local processing that ensures security but delivers poor accuracy. We present a question-answering system that resolves this trade-off by combining semantic understanding with keyword precision, operating entirely on local infrastructure without internet access. |
Paolo Astrino; | arxiv-cs.CL | 2025-11-13 |
| 259 | ProgRAG: Hallucination-Resistant Progressive Retrieval and Reasoning Over Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the retrieval and reasoning failures, we propose ProgRAG, a multi-hop knowledge graph question answering (KGQA) framework that decomposes complex questions into sub-questions, and progressively extends partial reasoning paths by answering each sub-question. |
Minbae Park; Hyemin Yang; Jeonghyun Kim; Kunsoo Park; Hyunjoon Kim; | arxiv-cs.AI | 2025-11-13 |
| 260 | OIDA-QA: A Multimodal Benchmark for Analyzing The Opioid Industry Documents Archive Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The complexity, multimodal nature, and specialized characteristics of these healthcare-related legal and corporate documents necessitate more advanced methods and models tailored to specific data types and detailed annotations, ensuring the precision and professionalism in the analysis. In this paper, we tackle this challenge by organizing the original dataset according to document attributes and constructing a benchmark with 400k training documents and 10k for testing. |
XUAN SHEN et. al. | arxiv-cs.AI | 2025-11-12 |
| 261 | A Hybrid Search for Complex Table Question Answering in Securities Report Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a cell extraction method for TQA without manual identification, even for complex table headers. |
Daiki Shirafuji; Koji Tanaka; Tatsuhiko Saito; | arxiv-cs.CL | 2025-11-12 |
| 262 | Answering Students’ Questions on Course Forums Using Multiple Chain-of-Thought Reasoning and Finetuning RAG-Enabled LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we experiment fine-tuned LLM with RAG method on the HotpotQA dataset. |
Neo Wang; Sonit Singh; | arxiv-cs.CL | 2025-11-12 |
| 263 | Testing Question Answering Software with Context-Driven Question Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce CQ^2A, a context-driven question generation approach for testing question-answering systems. |
SHUANG LIU et. al. | arxiv-cs.SE | 2025-11-11 |
| 264 | Self-Correction Distillation for Structured Data Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To improve the structured data QA ability of small-scale LLMs, we propose a self-correction distillation (SCD) method. |
YUSHAN ZHU et. al. | arxiv-cs.CL | 2025-11-11 |
| 265 | QueryBridge: One Million Annotated Questions with SPARQL Queries – Dataset for Question Answering Over Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing benchmark datasets (e.g., QALD, LC-QuAD) are limited in size and annotation, hindering QAKG model generalization. To address this, we present QueryBridge, a dataset with over one million annotated questions paired with SPARQL queries. |
Abdelghny Orogat; Ahmed El-Roby; | cikm | 2025-11-10 |
| 266 | StreamKV: Streaming Video Question-Answering with Segment-based KV Cache Retrieval and Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose StreamKV, a training-free framework that seamlessly equips Video-LLMs with advanced KV cache retrieval and compression. |
YILONG CHEN et. al. | arxiv-cs.CV | 2025-11-10 |
| 267 | Structuring Video Semantics with Temporal Triplets for Zero-Shot Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a structured representation based on temporal triplets to address two major challenges in traditional approaches: temporal fragmentation and entity reference ambiguity. |
LINLIN ZONG et. al. | cikm | 2025-11-10 |
| 268 | A Pivot-Enhanced Question Answering Framework: Using Iterative Sub-Question Decomposition and Answer-to-Question Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Question and Answering (QA) in low-resource languages remains a significant challenge due to the scarcity of high-quality training data. To address this, we propose a robust framework for low-resource QA. |
Seyeon Park; Beakcheol Jang; | cikm | 2025-11-10 |
| 269 | FinSage: A Multi-aspect RAG System for Financial Filings Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose the FinSage framework as a solution, utilizing a multi-aspect RAG framework tailored for data retrieval and summarization in multi-modal financial documents. |
XINYU WANG et. al. | cikm | 2025-11-10 |
| 270 | C-FAITH: A Chinese Fine-Grained Benchmark for Automated Hallucination Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most existing hallucination benchmarks (especially in Chinese language) rely on human annotations, making automatical and cost-effective hallucination evaluation challenging. To address this, we introduce HaluAgent, an agentic framework that automatically constructs fine-grained question-answering (QA) dataset based on some knowledge documents. |
XU ZHANG et. al. | cikm | 2025-11-10 |
| 271 | Bridging The Gap Between Knowledge Graphs and LLMs for Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a novel structure-text knowledge synergistic method, BrikQA, which bridges the knowledge gap between knowledge graphs (KGs) and LLMs for multi-hop KGQA. |
Shijie Luo; Xinyuan Lu; Qinpei Zhao; Weixiong Rao; | cikm | 2025-11-10 |
| 272 | Evaluating Robustness of LLMs in Question Answering on Multilingual Noisy OCR Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we conduct a comprehensive analysis of how OCR-induced noise affects the performance of Multilingual QA Systems. |
Bhawna Piryani; Jamshid Mozafari; Abdelrahman Abdallah; Antoine Doucet; Adam Jatowt; | cikm | 2025-11-10 |
| 273 | Reference-Aligned Retrieval-Augmented Question Answering Over Heterogeneous Proprietary Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these, we propose a RAG-QA framework for internal enterprise use, consisting of: (1) a data pipeline that converts raw multi-modal documents into a structured corpus and QA pairs, (2) a fully on-premise, privacy-preserving architecture, and (3) a lightweight reference matcher that links answer segments to supporting content. |
NAYOUNG CHOI et. al. | cikm | 2025-11-10 |
| 274 | Disentangling Complex Questions in LLMs Via Multi-Hop Dependency Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a novel prompt approach for multi-hop QA viz., MoDeGraph (Multi-Hop Dependency Graphs), that is designed to steer LLMs to extract and model entity relationships in complex questions. |
ROLAND ORUCHE et. al. | cikm | 2025-11-10 |
| 275 | SQuAI: Scientific Question-Answering with Multi-Agent Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present SQuAI (https://squai.scads.ai/), a scalable and trustworthy multi-agent retrieval-augmented generation (RAG) framework for scientific question answering (QA) with large language models (LLMs). |
Ines Besrour; Jingbo He; Tobias Schreieder; Michael Färber; | cikm | 2025-11-10 |
| 276 | NLP-QA: A Large-scale Benchmark for Informative Question Answering Over Natural Language Processing Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, scholarly QA development is hindered by the scarcity of large-scale, expertly-annotated datasets, that are needed for modern deep learning models. To address this gap and advance scholarly QA, we introduce NLP-QA, a new dataset of question-answer pairs derived from NLP research documents. |
Avishek Lahiri; Debarshi Kumar Sanyal; Imon Mukherjee; | cikm | 2025-11-10 |
| 277 | Revisiting NLI: Towards Cost-Effective and Human-Aligned Metrics for Evaluating LLMs in Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We re-evaluate a lightweight alternative — off-the-shelf Natural Language Inference (NLI) scoring augmented by a simple lexical-match flag and find that this decades-old technique matches GPT-4o’s accuracy (89.9%) on long-form QA, while requiring orders-of-magnitude fewer parameters. To test human alignment of these metrics rigorously, we introduce DIVER-QA, a new 3000-sample human-annotated benchmark spanning five QA datasets and five candidate LLMs. |
Sai Shridhar Balamurali; Lu Cheng; | arxiv-cs.CL | 2025-11-10 |
| 278 | PEQQS: A Dataset for Probing Extractive Quantity-focused Question Answering from Scientific Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present how our dataset can be used both for the evaluation of extractive quantity-focused QA from science literature and for exploring the impact of search on the downstream results, specifically focusing on hallucinations resulting from processing non-relevant documents with LLMs. |
Maciej Rybinski; Necva Bölücü; Huichen Yang; Stephen Wan; | cikm | 2025-11-10 |
| 279 | Company-Specific Knowledge Matters: Retrieval-Augmented Generation for Earnings Call Answer Rehearsal Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper explores how to better support corporate executives in answering questions from professional analysts during earnings calls. |
Yung-Yu Shih; Yun-Nung Chen; Chung-Chi Chen; | cikm | 2025-11-10 |
| 280 | VQA-Induct: Instruction Induction for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, current approaches for enhancing VQA reasoning performance often assume access to extensive resources such as large annotated datasets, external tools, or numerous demonstrations, which are impractical for real-world users who typically possess only a few demonstrations. We present VQA-Induct, a framework for data-scarce scenarios that leverages MLLMs’ instruction induction capabilities to induce reusable, purely textual task-level instructions from as few as three demonstrations of the same task, then applies these instructions to new instances using only their image-question pairs. |
Po-Chun Chen; Hen-Hsen Huang; Hsin-Hsi Chen; | cikm | 2025-11-10 |
| 281 | BookAsSumQA: An Evaluation Framework for Aspect-Based Book Summarization Via Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: BookAsSumQA automatically generates aspect-specific QA pairs from a narrative knowledge graph to evaluate summary quality based on its question-answering performance. Our experiments using BookAsSumQA revealed that while LLM-based approaches showed higher accuracy on shorter texts, RAG-based methods become more effective as document length increases, making them more efficient and practical for aspect-based book summarization. |
Ryuhei Miyazato; Ting-Ruen Wei; Xuyang Wu; Hsin-Tai Wu; Kei Harada; | arxiv-cs.CL | 2025-11-08 |
| 282 | PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts Into Prompt Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Parameter-efficient fine-tuning (PEFT) methods have shown promise in adapting large language models, yet existing approaches exhibit counter-intuitive phenomena: integrating either matrix decomposition or mixture-of-experts (MoE) individually decreases performance across tasks, though decomposition improves results on specific domains despite reducing parameters, while MoE increases parameter count without corresponding decrease in training efficiency. Motivated by these observations and the modular nature of PT, we propose PT-MoE, a novel framework that integrates matrix decomposition with MoE routing for efficient PT. |
Zongqian Li; Yixuan Su; Nigel Collier; | nips | 2025-11-07 |
| 283 | Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking By Contrasting Layers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing RAG methods for simple and multi-hop question answering (QA) are still prone to incorrect retrievals and hallucinations. To address these limitations, we propose CoopRAG, a novel RAG framework for the question answering task in which a retriever and an LLM work cooperatively with each other by exchanging informative knowledge, and the earlier and later layers of the retriever model work cooperatively with each other to accurately rank the retrieved documents relevant to a given query. |
Youmin Ko; Sung Jong Seo; Hyunjoon Kim; | nips | 2025-11-07 |
| 284 | DSAS: A Universal Plug-and-Play Framework for Attention Optimization in Multi-Document Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current solutions either truncate global dependencies or demand costly finetuning, ultimately lacking a universal and simple solution for these challenges. To resolve these limitations, we propose Dual-Stage Adaptive Sharpening (DSAS) containing two modules. |
JIAKAI LI et. al. | nips | 2025-11-07 |
| 285 | QuAnTS: Question Answering on Time Series Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We verify that the large-scale QuAnTS dataset is well-formed and comprehensive through extensive experiments. |
FELIX DIVO et. al. | arxiv-cs.LG | 2025-11-07 |
| 286 | Temporal Chain of Thought: Long-Video Understanding By Thinking in Frames Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Dynamic Context Aggregation, an inference strategy for video question-answering that curates the model’s input context. |
Anurag Arnab; Ahmet Iscen; Mathilde Caron; Alireza Fathi; Cordelia Schmid; | nips | 2025-11-07 |
| 287 | Ground-Truth Subgraphs for Better Training and Evaluation of Knowledge Graph Augmented LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We apply SynthKGQA to Wikidata to generate GTSQA, a new dataset designed to test zero-shot generalization abilities of KG retrievers with respect to unseen graph structures and relation types, and benchmark popular solutions for KG-augmented LLMs on it. |
Alberto Cattaneo; Carlo Luschi; Daniel Justus; | arxiv-cs.LG | 2025-11-06 |
| 288 | The Illusion of Certainty: Uncertainty Quantification for LLMs Fails Under Ambiguity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While real-world language is inherently ambiguous, reflecting aleatoric uncertainty, existing UQ methods are typically benchmarked against tasks with no ambiguity. In this work, we demonstrate that while current uncertainty estimators perform well under the restrictive assumption of no ambiguity, they degrade to close-to-random performance on ambiguous data. |
Tim Tomov; Dominik Fuchsgruber; Tom Wollschläger; Stephan Günnemann; | arxiv-cs.LG | 2025-11-06 |
| 289 | BanglaMedQA and BanglaMMedBench: Evaluating Retrieval-Augmented Generation Strategies for Bangla Biomedical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces BanglaMedQA and BanglaMMedBench, the first large-scale Bangla biomedical Multiple Choice Question (MCQ) datasets designed to evaluate reasoning and retrieval in medical artificial intelligence (AI). |
Sadia Sultana; Saiyma Sittul Muna; Mosammat Zannatul Samarukh; Ajwad Abrar; Tareque Mohmud Chowdhury; | arxiv-cs.CL | 2025-11-06 |
| 290 | Knowledge-Augmented Question Error Correction for Chinese Question Answer System with QuestionRAG Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large language models (LLMs) struggle with this task, frequently failing to interpret user intent (misinterpretation) or unnecessarily altering the original question’s structure (over-correction). We propose QuestionRAG, a framework that tackles these problems. |
Longpeng Qiu; Ting Li; Shuai Mao; Nan Yang; Xiaohui Yan; | arxiv-cs.CL | 2025-11-05 |
| 291 | Comparing The Performance of LLMs in RAG-based Question-Answering: A Case Study in Computer Science Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Evaluation metrics employed in the study include accuracy and precision for binary questions and ranking by a human expert, ranking by Google’s AI model Gemini, alongside cosine similarity for long-answer questions. |
Ranul Dayarathne; Uvini Ranaweera; Upeksha Ganegoda; | arxiv-cs.CL | 2025-11-05 |
| 292 | ChiMDQA: Towards Comprehensive Chinese Document QA with Fine-grained Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With the rapid advancement of natural language processing (NLP) technologies, the demand for high-quality Chinese document question-answering datasets is steadily growing. To address this issue, we present the Chinese Multi-Document Question Answering Dataset (ChiMDQA), specifically designed for downstream business scenarios across prevalent domains including academic, education, finance, law, medical treatment, and news. |
Jing Gao; Shutiao Luo; Yumeng Liu; Yuanming Li; Hongji Zeng; | arxiv-cs.CL | 2025-11-05 |
| 293 | Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis, Solution, and Interpretation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Through attention analysis, we find that learning new knowledge reduces the model’s attention to key entities in the question, thus causing excessive focus on the surrounding context, which may increase the risk of hallucination. |
Renfei Dang; Peng Hu; Changjiang Gao; Shujian Huang; | arxiv-cs.CL | 2025-11-04 |
| 294 | When to Trust The Answer: Question-Aligned Semantic Nearest Neighbor Entropy for Safer Surgical VQA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Question Aligned Semantic Nearest Neighbor Entropy (QA-SNNE), a black box uncertainty estimator that incorporates question semantics into prediction confidence. |
DENNIS PIERANTOZZI et. al. | arxiv-cs.CV | 2025-11-03 |
| 295 | Improving Construction Contract Question Answering Through Embedding Optimization and Semantic Chunking in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Lanqian Zhang; Yan Ning; | Advanced Engineering Informatics | 2025-11-03 |
| 296 | Hybrid Retrieval-Augmented Generation Agent for Trustworthy Legal Question Answering in Judicial Forensics Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: As artificial intelligence permeates judicial forensics, ensuring the veracity and traceability of legal question answering (QA) has become critical. Conventional large language … |
YUEQING XI et. al. | arxiv-cs.AI | 2025-11-03 |
| 297 | DEEPAMBIGQA: Ambiguous Multi-hop Questions for Benchmarking LLM Answer Completeness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing QA benchmarks rarely evaluate both challenges jointly. To address this, we introduce DeepAmbigQAGen, an automatic data generation pipeline that constructs QA tasks grounded in text corpora and linked knowledge graph, generating natural and verifiable questions that systematically embed name ambiguity and multi-step reasoning. |
Jiabao Ji; Min Li; Priyanshu Kumar; Shiyu Chang; Saloni Potdar; | arxiv-cs.CL | 2025-11-03 |
| 298 | A Graph-based RAG for Energy Efficiency Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we investigate the use of Large Language Models (LLMs) within a graph-based Retrieval Augmented Generation (RAG) architecture for Energy Efficiency (EE) Question Answering. |
RICCARDO CAMPI et. al. | arxiv-cs.CL | 2025-11-03 |
| 299 | Text-VQA Aug: Pipelined Harnessing of Large Multimodal Models for Automated Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a pipeline for automated synthesis for text-VQA dataset that can produce faithful QA pairs, and which scales up with the availability of scene text data. |
Soham Joshi; Shwet Kamal Mishra; Viswanath Gopalakrishnan; | arxiv-cs.CV | 2025-11-03 |
| 300 | StepSearch: Igniting LLMs Search Ability Via Step-Wise Proximal Policy Optimization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Previous work has explored reinforcement learning (RL) to train LLMs to perform search-based document retrieval, achieving notable improvements in QA performance, but underperform on complex, multi-hop QA resulting from the sparse rewards from global signal only. To address this gap in existing research, we introduce StepSearch, a framework for search LLMs that trained with step-wise proximal policy optimization method. |
Xuhui Zheng; Kang An; Ziliang Wang; Yuhang Wang; Yichao Wu; | emnlp | 2025-11-02 |
| 301 | Tagging-Augmented Generation: Assisting Language Models in Finding Intricate Knowledge In Long Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose Tagging-Augmented Generation (TAG), a lightweight data augmentation strategy that boosts LLM performance in long-context scenarios, without degrading and altering the integrity and composition of retrieved documents. |
ANWESAN PAL et. al. | emnlp | 2025-11-02 |
| 302 | CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Embodied Question Answering (EQA) has primarily focused on indoor environments, leaving the complexities of urban settings—spanning environment, action, and perception—largely unexplored. To bridge this gap, we introduce CityEQA, a new task where an embodied agent answers open-vocabulary questions through active exploration in dynamic city spaces. |
YONG ZHAO et. al. | emnlp | 2025-11-02 |
| 303 | SilVar: Speech-Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, the quality of language models primarily depends on reasoning and prompting techniques, such as chain-of-thought, which remain underexplored when using speech instructions. To address these challenges, we propose SilVar, an end-to-end multimodal model that leverages speech instructions for reasoning-based visual question answering. |
Tan-Hanh Pham; Le Hoang Nam; Phu-Vinh Nguyen; Chris Ngo; Truong-Son Hy; | emnlp | 2025-11-02 |
| 304 | Don’t Forget The Base Retriever! A Low-Resource Graph-based Retriever for Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose GRIEVER, a lightweight, low-resource, multi-step graph-based retriever for multi-hop QA. |
ANDRE MELO et. al. | emnlp | 2025-11-02 |
| 305 | TVQACML: Benchmarking Text-Centric Visual Question Answering in Multilingual Chinese Minority Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, the open-source nature of these benchmarks and the broad sources of training data for MLLMs have inevitably led to benchmark contamination, resulting in unreliable evaluation results. To alleviate this issue, we propose a contamination-free and more challenging TEC-VQA benchmark called Text-Centric Visual Question Answering in Multilingual Chinese Minority Languages (TVQACML), which involves eight languages, including Standard Chinese, Korean, and six minority languages. |
Sha Jiu; Yu Weng; Mengxiao Zhu; Chong Feng; Zheng Liu; | emnlp | 2025-11-02 |
| 306 | Memory-QA: Answering Recall Questions Based on Multimodal Memories Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This task poses unique challenges, including the creation of task-oriented memories, the effective utilization of temporal and location information within memories, and the ability to draw upon multiple memories to answer a recall question. To address these challenges, we propose a comprehensive pipeline, Pensieve, integrating memory-specific augmentation, time- and location-aware multi-signal retrieval, and multi-memory QA fine-tuning. |
HONGDA JIANG et. al. | emnlp | 2025-11-02 |
| 307 | TableRAG: A Retrieval Augmented Generation Framework for Heterogeneous Document Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: The prevailing practice of flattening tables and chunking strategies disrupts the intrinsic tabular structure, leads to information loss, and undermines the reasoning capabilities of LLMs in multi-hop, global queries. To address these challenges, we propose TableRAG, an SQL-based framework that unifies textual understanding and complex manipulations over tabular data. |
Xiaohan Yu; Pu Jian; Chong Chen; | emnlp | 2025-11-02 |
| 308 | KoBLEX: Open Legal Question Answering with Multi-hop Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these benchmarks fail to evaluate open-ended and provision-grounded Question Answering (QA). To address this, we introduce a Korean Benchmark for Legal EXplainable QA (KoBLEX), designed to evaluate provision-grounded, multi-hop legal reasoning. |
Jihyung Lee; Daehui Kim; Seonjeong Hwang; Hyounghun Kim; Gary Lee; | emnlp | 2025-11-02 |
| 309 | NitiBench: Benchmarking LLM Frameworks on Thai Legal Question Answering Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce NitiBench, a novel benchmark featuring two datasets: (1) NitiBench-CCL, covering Thai financial laws, and (2) NitiBench-Tax, containing Thailand’s official tax rulings. |
PAWITSAPAK AKARAJARADWONG et. al. | emnlp | 2025-11-02 |
| 310 | From Chat Logs to Collective Insights: Aggregative Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Aggregative Question Answering, a novel task requiring models to reason explicitly over thousands of user-chatbot interactions to answer aggregational queries, such as identifying emerging concerns among specific demographics. |
Wentao Zhang; Woojeong Kim; Yuntian Deng; | emnlp | 2025-11-02 |
| 311 | StepER: Step-wise Knowledge Distillation for Enhancing Reasoning Ability in Multi-Step Retrieval-Augmented Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing knowledge distillation methods overlook the need for different reasoning abilities at different steps, hindering transfer in multi-step retrieval-augmented frameworks. To address this, we propose Step-wise Knowledge Distillation for Enhancing Reasoning Ability in Multi-Step Retrieval-Augmented Language Models (StepER). |
Kyumin Lee; Minjin Jeon; Sanghwan Jang; Hwanjo Yu; | emnlp | 2025-11-02 |
| 312 | Generating Spatial Knowledge Graphs from Automotive Diagrams for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We evaluate three distinct generation pipelines (Per-Attribute, Per-Component, and a Single-Shot baseline) to create the SKG using Large Vision-Language Models (LVLMs). |
Steve Bakos; Chen Xing; Heidar Davoudi; Aijun An; Ron DiCarlantonio; | emnlp | 2025-11-02 |
| 313 | Truth, Trust, and Trouble: Medical AI on The Edge Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a rigorous benchmarking framework via a dataset of over 1,000 health questions. |
MOHAMMAD ANAS AZEEZ et. al. | emnlp | 2025-11-02 |
| 314 | ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While large language models (LLMs) have achieved substantial improvements via chain-of-thought (CoT) prompting and retrieval-augmented generation, these methods typically adopt a forward-only workflow—early mistakes persist throughout inference, and contradictions discovered later cannot systematically trigger re-evaluation. To address this limitation, we present ReAgent, a reversible multi-agent reasoning framework. |
ZHAO XINJIE et. al. | emnlp | 2025-11-02 |
| 315 | Discrepancy Detection at The Data Level: Toward Consistent Multilingual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose MIND, a user-in-the-loop fact-checking pipeline to detect factual and cultural discrepancies in multilingual QA knowledge bases. |
LORENA CALVO-BARTOLOMÉ et. al. | emnlp | 2025-11-02 |
| 316 | Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce EverGreenQA, the first multilingual QA dataset with evergreen labels, supporting both evaluation and training. |
SERGEY PLETENEV et. al. | emnlp | 2025-11-02 |
| 317 | LaMP-QA: A Benchmark for Personalized Long-form Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This is mainly due to lack of resources for training and evaluating personalized question answering systems. We address this gap by introducing LaMP-QA—a benchmark designed for evaluating personalized long-form answer generation. |
Alireza Salemi; Hamed Zamani; | emnlp | 2025-11-02 |
| 318 | CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, they still face challenges in balancing retrieval precision and recall, impacting their efficacy in answering questions. To address this, we introduce **CAFE**, a two-stage coarse-to-fine method to enhance multi-document question-answering capacities. |
Han Peng; Jinhao Jiang; Zican Dong; Xin Zhao; Lei Fang; | emnlp | 2025-11-02 |
| 319 | ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce ComplexTempQA, a large-scale dataset consisting of over 100 million question-answer pairs designed to tackle the challenges in temporal question answering. |
Raphael Gruber; Abdelrahman Abdallah; Michael Färber; Adam Jatowt; | emnlp | 2025-11-02 |
| 320 | FLARE: Faithful Logic-Aided Reasoning and Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Faithful Logic-Aided Reasoning and Exploration (FLARE), which uses LLMs to plan solutions, formalize queries into logic programs, and simulate code execution through multi-hop search without external solvers. |
Erik Arakelyan; Pasquale Minervini; Patrick Lewis; Pat Verga; Isabelle Augenstein; | emnlp | 2025-11-02 |
| 321 | Faster In-Context Learning for LLMs Via N-Gram Trie Speculative Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the lengthy retrieved contexts and limited token throughput in autoregressive models significantly constrain reasoning speed. To address this challenge, we propose N-Gram Trie Speculative Decoding, a novel approach that leverages the overlap between context and model output. |
JINGLIN CHEN et. al. | emnlp | 2025-11-02 |
| 322 | LiteraryQA: Towards Effective Evaluation of Long-document Narrative QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce LiteraryQA, a high-quality subset of NarrativeQA focused on literary works. |
Tommaso Bonomo; Luca Gioffré; Roberto Navigli; | emnlp | 2025-11-02 |
| 323 | Static or Dynamic: Towards Query-Adaptive Token Selection for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel token selection strategy, explore-then-select, that adaptively adjusts static and dynamic information based on question requirements. |
Yumeng Shi; Quanyu Long; Wenya Wang; | emnlp | 2025-11-02 |
| 324 | RTQA : Recursive Thinking for Complex Temporal Knowledge Graph Question Answering with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current temporal knowledge graph question answering (TKGQA) methods primarily focus on implicit temporal constraints, lacking the capability to handle more complex temporal queries, and struggle with limited reasoning abilities and error propagation in decomposition frameworks. We propose RTQA, a novel framework to address these challenges by enhancing reasoning over TKGs without requiring training. |
ZHAOYAN GONG et. al. | emnlp | 2025-11-02 |
| 325 | Factual and Musical Evaluation Metrics for Music Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To measure the true performance of Music LMs, we propose (1) a better general-purpose evaluation metric for Music LMs adapted to the music domain and (2) a factual evaluation framework to quantify the correctness of a Music LM’s responses. |
Daniel Chenyu Lin; Michael Freeman; John Thickstun; | arxiv-cs.SD | 2025-11-02 |
| 326 | TALON: A Multi-Agent Framework for Long-Table Exploration and Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose TALON, a multi-agent framework designed for question answering over long tables. |
RUOCHUN JIN et. al. | emnlp | 2025-11-02 |
| 327 | CondAmbigQA: A Benchmark and Dataset for Conditional Ambiguous Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Therefore, identifying possible implicit assumptions is crucial in QA. To address this fundamental challenge, we propose Conditional Ambiguous Question-Answering (CondAmbigQA), a benchmark comprising 2,000 ambiguous queries and condition-aware evaluation metrics. |
Zongxi Li; Yang Li; Haoran Xie; S. Joe Qin; | emnlp | 2025-11-02 |
| 328 | XLQA: A Benchmark for Locale-Aware Multilingual Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This assumption neglects the cultural and regional variations that affect question understanding and answer, leading to biased evaluation in multilingual benchmarks. To address these limitations, we introduce XLQA, a novel benchmark explicitly designed for locale-sensitive multilingual ODQA. |
Keonwoo Roh; Yeong-Joon Ju; Seong-Whan Lee; | emnlp | 2025-11-02 |
| 329 | MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although Large Language Models (LLMs) and Retrieval-augmented Generation (RAG) systems show promise, their performance on cross-document MEQA remains underexplored due to the absence of tailored benchmarks. To address this gap, we introduce MEBench, a scalable multi-document, multi-entity benchmark designed to systematically evaluate LLMs’ capacity to retrieve, consolidate, and reason over scattered and dense information. |
TENG LIN et. al. | emnlp | 2025-11-02 |
| 330 | FacLens: Transferable Probe for Foreseeing Non-Factuality in Fact-Seeking Question Answering of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a lightweight model named Factuality Lens (FacLens), which effectively probes hidden representations of fact-seeking questions for the NFP task. |
YANLING WANG et. al. | emnlp | 2025-11-02 |
| 331 | ProtoVQA: An Adaptable Prototypical Framework for Explainable Fine-Grained Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present ProtoVQA, a unified prototypical framework that (i) learns question-aware prototypes that serve as reasoning anchors, connecting answers to discriminative image regions, (ii) applies spatially constrained matching to ensure that the selected evidence is coherent and semantically relevant, and (iii) supports both answering and grounding tasks through a shared prototype backbone. |
XINGJIAN DIAO et. al. | emnlp | 2025-11-02 |
| 332 | Refining Attention for Explainable and Noise-Robust Fact-Checking with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Conventional transformer-based models excel at classifying input data, but (i) often falter due to sensitivity to noise and (ii) lack explainability regarding their decision process. To address these challenges, we introduce ATTUN, a novel transformer architecture designed to enhance model transparency and resilience to noise by refining the attention mechanisms. |
Jean-Flavien Bussotti; Paolo Papotti; | emnlp | 2025-11-02 |
| 333 | T2: An Adaptive Test-Time Scaling Strategy for Contextual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: But they add human bias to the reasoning process and fail to leverage models’ inherent reasoning capabilities. To address these limitations, we present T2: Think-to-Think, a novel framework that dynamically adapts reasoning depth based on question complexity. |
ZHENGYI ZHAO et. al. | emnlp | 2025-11-02 |
| 334 | TreeRare: Syntax Tree-Guided Retrieval and Reasoning for Knowledge-Intensive Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the performance of such retrieval frameworks is limited by the accumulation of reasoning errors and misaligned retrieval results. To overcome these limitations, we propose TreeRare (Syntax Tree-Guided Retrieval and Reasoning), a framework that utilizes syntax trees to guide information retrieval and reasoning for question answering. |
Boyi Zhang; Zhuo Liu; Hangfeng He; | emnlp | 2025-11-02 |
| 335 | RAGferee: Building Contextual Reward Models for Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the lack of publicly available RAG-centric preference datasets and specialised RMs, we introduce RAGferee, a methodology that repurposes question-answering (QA) datasets into preference pairs that prioritise groundedness over stylistic features, enabling the training of contextual RMs better suited to judging RAG responses. |
ANDREI CATALIN COMAN et. al. | emnlp | 2025-11-02 |
| 336 | CompKBQA: Component-wise Task Decomposition for Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the challenge of generating error-free logical forms remains, as skeleton, topic entity, and relation errors still frequently occur. To address these challenges, we propose CompKBQA (Component-wise Task Decomposition for Knowledge Base Question Answering), a novel framework that optimizes the process of fine-tuning an LLM for generating logical forms by enabling the LLM to progressively learn relevant sub-tasks like skeleton generation, topic entity generation, and relevant relations generation. |
YUHANG TIAN et. al. | emnlp | 2025-11-02 |
| 337 | Weaver: Interweaving SQL and LLM for Table Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing approaches that combine SQL and LLM typically rely on rigid, predefined workflows, limiting their adaptability to complex queries. To address these issues, we introduce Weaver, a modular pipeline that dynamically integrates SQL and LLM for table-based question answering (Table QA). |
Rohit Khoja; Devanshu Gupta; Yanjie Fu; Dan Roth; Vivek Gupta; | emnlp | 2025-11-02 |
| 338 | How Accurate Are LLMs at Multi-Question Answering on Conversational Transcripts? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we explore the capabilities of LLMs to answer multiple questions based on the same conversational context. |
Xiliang Zhu; Shi Zong; David Rossouw; | emnlp | 2025-11-02 |
| 339 | CoCoA: Confidence- and Context-Aware Adaptive Decoding for Resolving Knowledge Conflicts in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce CoCoA (Confidence- and Context-Aware Adaptive Decoding), a novel token-level algorithm for principled conflict resolution and enhanced faithfulness. |
Anant Khandelwal; Manish Gupta; Puneet Agrawal; | emnlp | 2025-11-02 |
| 340 | Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this survey, we propose a new structured taxonomy that categorizes the methodology of synthesizing LLMs and KGs for QA according to the categories of QA and the KG’s role when integrating with LLMs. |
Chuangtao Ma; Yongrui Chen; Tianxing Wu; Arijit Khan; Haofen Wang; | emnlp | 2025-11-02 |
| 341 | Answering Narrative-Driven Recommendation Queries Via A Retrieve–Rank Paradigm and The OCG-Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work formally introduces narrative recommendation as a distinct task and contends that the RAG paradigm is inherently ill-suited for it, owing to information loss in LLMs when retrieving information from multiple long and fragmented contexts, and limitations in ranking effectiveness. |
YUNXIAO SHI et. al. | emnlp | 2025-11-02 |
| 342 | SportReason: Evaluating Retrieval-Augmented Reasoning Across Tables and Text for Sports Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present SportReason, a benchmark for retrieval-augmented reasoning on numerical sports questions. |
Kaiyue Feng; Siyue Zhang; Bingsen Chen; Yilun Zhao; Chen Zhao; | emnlp | 2025-11-02 |
| 343 | Trustworthy Medical Question Answering: An Evaluation-Centric Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this survey, we systematically examine six key dimensions of trustworthiness in medical QA, i.e., Factuality, Robustness, Fairness, Safety, Explainability, and Calibration. |
YINUO WANG et. al. | emnlp | 2025-11-02 |
| 344 | Confidence-guided Refinement Reasoning for Zero-shot Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Confidence-guided Refinement Reasoning (C2R), a novel training-free framework applicable to question-answering (QA) tasks across text, image, and video domains. |
Youwon Jang; Woo Suk Choi; Minjoon Jung; Minsu Lee; Byoung-Tak Zhang; | emnlp | 2025-11-02 |
| 345 | What Are Foundation Models Cooking in The Post-Soviet World? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we investigate the Post-Soviet cultural food knowledge of foundation models by constructing BORSch, a multi-modal dataset encompassing 1147 and 823 dishes in the Russian and Ukrainian languages, centered around the Post-Soviet region. |
Anton Lavrouk; Tarek Naous; Alan Ritter; Wei Xu; | emnlp | 2025-11-02 |
| 346 | BYOKG-RAG: Multi-Strategy Graph Retrieval for Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce BYOKG-RAG, a framework that enhances KGQA by synergistically combining LLMs with specialized graph retrieval tools. |
COSTAS MAVROMATIS et. al. | emnlp | 2025-11-02 |
| 347 | PakBBQ: A Culturally Adapted Bias Benchmark for QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most LLMs are trained and evaluated on Western centric data, with little attention paid to low-resource languages and regional contexts. To address this gap, we introduce PakBBQ, a culturally and regionally adapted extension of the original Bias Benchmark for Question Answering (BBQ) dataset. |
Abdullah Hashmat; Muhammad Arham Mirza; Agha Ali Raza; | emnlp | 2025-11-02 |
| 348 | UNCLE: Benchmarking Uncertainty Expressions in Long-Form Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Along with UNCLE, we propose a suite of new metrics to assess the models’ capabilities to selectively express uncertainty. |
RUIHAN YANG et. al. | emnlp | 2025-11-02 |
| 349 | RAVEN: Query-Guided Representation Alignment for Question Answering Over Audio, Video, Embedded Sensors, and Natural Language Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present RAVEN, a unified QA architecture whose core is QuART, a query-conditioned cross-modal gating module that assigns scalar relevance scores to each token across modalities, enabling the model to amplify informative signals and suppress distractors before fusion. |
Subrata Biswas; Mohammad Nur Hossain Khan; Bashima Islam; | emnlp | 2025-11-02 |
| 350 | Retrieving Support to Rank Answers in Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a novel Question Answering (QA) architecture that enhances answer selection by retrieving targeted supporting evidence. |
Zeyu Zhang; Alessandro Moschitti; Thuy Vu; | emnlp | 2025-11-02 |
| 351 | Enhancing Long-form Question Answering Via Reflection with Question Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Junjie Xiao; Wei Wu; Jiaxu Zhao; Meng Fang; Jianxin Wang; | Inf. Process. Manag. | 2025-11-01 |
| 352 | VinDr-CXR-VQA: A Visual Question Answering Dataset for Explainable Chest X-Ray Analysis with Multi-Task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present VinDr-CXR-VQA, a large-scale chest X-ray dataset for explainable Medical Visual Question Answering (Med-VQA) with spatial grounding. |
Hai-Dang Nguyen; Ha-Hieu Pham; Hao T. Nguyen; Huy-Hieu Pham; | arxiv-cs.CV | 2025-11-01 |
| 353 | FARSIQA: Faithful and Advanced RAG System for Islamic Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing Retrieval-Augmented Generation (RAG) systems, relying on simplistic single-pass pipelines, fall short on complex, multi-hop queries requiring multi-step reasoning and evidence aggregation. To address this gap, we introduce FARSIQA, a novel, end-to-end system for Faithful Advanced Question Answering in the Persian Islamic domain. |
Mohammad Aghajani Asl; Behrooz Minaei Bidgoli; | arxiv-cs.CL | 2025-10-29 |
| 354 | Beyond Long Context: When Semantics Matter More Than Tokens Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The Clinical Entity Augmented Retrieval (CLEAR) method, introduced by Lopez et al. 2025, uses entity aware retrieval and achieved improved performance with an F1 score of 0.90 versus 0.86 for embedding based retrieval, while using over 70 percent fewer tokens. |
Tarun Kumar Chawdhury; Jon D. Duke; | arxiv-cs.CL | 2025-10-29 |
| 355 | A Multimodal and Dynamically Updatable Benchmark for Aviation Question Answering with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a multimodal, multi-level benchmark dataset tailored to aviation QA tasks, alongside an automated updating mechanism and a multi-dimensional evaluation framework. |
Liu He; Shuyan Liu; Xiaorui Qin; Ran An; Jianghui Zeng; | International Journal of Robotics and Automation Technology | 2025-10-29 |
| 356 | Adapting Small Language Models to Low-Resource Domains: A Case Study in Hindi Tourism QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present a multi-stage finetuning strategy to adapt lightweight language models to the Hindi tourism domain by leveraging both original and synthetic training data. |
Sandipan Majhi; Paheli Bhattacharya; | arxiv-cs.CL | 2025-10-29 |
| 357 | A Knowledge Graph Enhancement Technique for HIPAA Compliant Health Question Answering in Personal Health Libraries Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In contrast, Knowledge Graphs (KGs) have proven effective for QA tasks, especially in extracting structured insights from text, but transforming free text into KGs often leads to information or context loss that can compromise answer accuracy. To overcome this challenge, we present a novel iterative and monotonic KG refinement technique that enriches knowledge representation without sacrificing contextual integrity. |
Hasan Jamil; | ACM Transactions on Computing for Healthcare | 2025-10-28 |
| 358 | MARA: A Multimodal Adaptive Retrieval-Augmented Framework for Document Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, current approaches rely on query-agnostic document representations that overlook salient content and use static top-k evidence selection, which fails to adapt to the uncertain distribution of relevant information. To address these limitations, we propose the Multimodal Adaptive Retrieval-Augmented (MARA) framework, which introduces query-adaptive mechanisms to both retrieval and generation. |
HUI WU et. al. | mm | 2025-10-27 |
| 359 | VQA2: Visual Question Answering for Video Quality Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Nevertheless, related work has not been explored in the video domain, leaving substantial room for improvement. To address this gap, we introduce the VQA² Instruction Dataset, the first visual question answering instruction dataset that focuses on video quality assessment. |
ZIHENG JIA et. al. | mm | 2025-10-27 |
| 360 | DMC3: Dual-Modal Counterfactual Contrastive Construction for Egocentric Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although existing methods have made progress through the paradigm of pre-training and fine-tuning, they ignore the unique challenges posed by the first-person perspective, such as understanding multiple events and recognizing hand-object interactions. To deal with these challenges, we propose a Dual-Modal Counterfactual Contrastive Construction (DMC3) framework, which contains an egocentric videoqa baseline, a counterfactual sample construction module and a counterfactual sample-involved contrastive optimization. |
Jiayi Zou; Chaofan Chen; Bing-Kun Bao; Changsheng Xu; | mm | 2025-10-27 |
| 361 | DR-VQA: Decompose-then-Reconstruct for Visual Question Answering in BLV Assistance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce DR-VQA (Decompose-then-Reconstruct Visual Question Answering), a novel framework that balances user intent with visual facts. |
Bocheng Pan; Hailong Shi; Xingyu Gao; | mm | 2025-10-27 |
| 362 | IPQA: A Benchmark for Core Intent Identification in Personalized Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Since users do not explicitly state their prioritized intents, we derive core intents from observable behavior patterns in answer selection, grounded in satisficing theory where users choose answers meeting their acceptance thresholds. |
Jieyong Kim; Maryam Amirizaniani; Soojin Yoon; Dongha Lee; | arxiv-cs.CL | 2025-10-27 |
| 363 | Valor32k-AVQA V2.0: Open-Ended Audio-Visual Question Answering Dataset and Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite growing interest in Audio-Visual Question Answering (AVQA), existing datasets often suffer from limited diversity, rigid formats, and insufficient integration of audio and visual modalities. To address these limitations, we introduce Valor32k-AVQA v2.0, a large-scale dataset containing 28,863 real-world videos and over 225,000 QA pairs, designed to support diverse and realistic multimodal understanding. |
Ines Riahi; Abduljalil Radman; Zixin Guo; Rachid Hedjam; Jorma Laaksonen; | mm | 2025-10-27 |
| 364 | Towards Complex Table Question Answering Over Tabular Data Lakes (Extended Version) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we systematically analyze how LLMs paired with table retrievers can answer queries over private tabular data lakes. |
Daniela Risis; Jan-Micha Bodensohn; Matthias Urban; Carsten Binnig; | Datenbank-Spektrum | 2025-10-27 |
| 365 | RSVLM-QA: A Benchmark Dataset for Remote Sensing Vision Language Model-based Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces Remote Sensing Vision Language Model Question Answering (RSVLM-QA) dataset, a new large-scale, content-rich VQA dataset for the RS domain. |
XING ZI et. al. | mm | 2025-10-27 |
| 366 | DMC$^3$: Dual-Modal Counterfactual Contrastive Construction for Egocentric Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To deal with these challenges, we propose a Dual-Modal Counterfactual Contrastive Construction (DMC$^3$) framework, which contains an egocentric videoqa baseline, a counterfactual sample construction module and a counterfactual sample-involved contrastive optimization. |
Jiayi Zou; Chaofan Chen; Bing-Kun Bao; Changsheng Xu; | arxiv-cs.CV | 2025-10-23 |
| 367 | GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose GlobalRAG, a reinforcement learning framework designed to enhance global reasoning in multi-hop QA. GlobalRAG decomposes questions into subgoals, coordinates retrieval with reasoning, and refines evidence iteratively. |
JINCHANG LUO et. al. | arxiv-cs.CL | 2025-10-23 |
| 368 | Bridging Language Gaps with Adaptive RAG: Improving Indonesian Language Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To overcome the limited availability of Indonesian language datasets, our study employs machine translation as a data augmentation approach. |
William Christian; Daniel Adamlu; Adrian Yu; Derwin Suhartono; | arxiv-cs.CL | 2025-10-23 |
| 369 | Task-guided Dynamic Visual Reasoning for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we propose a task-guided dynamic visual reasoning method for visual question answering, which models the spatiotemporal states of objects in dynamic scenes, decomposes the questions into task steps, and finally deduces reasoning on the established spatiotemporal dynamic scene graph neural network. |
Yao Cong; Hongwei Mo; | International Journal of Humanoid Robotics | 2025-10-23 |
| 370 | Multimedia-Aware Question Answering: A Review of Retrieval and Cross-Modal Reasoning Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this survey, we review recent advancements in QA systems that integrate multimedia retrieval pipelines, focusing on architectures that align vision, language, and audio modalities with user queries. |
Rahul Raja; Arpita Vats; | arxiv-cs.IR | 2025-10-23 |
| 371 | VLSP 2025 MLQA-TSR Challenge: Vietnamese Multimodal Legal Question Answering on Traffic Sign Regulation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents the VLSP 2025 MLQA-TSR – the multimodal legal question answering on traffic sign regulation shared task at VLSP 2025. |
SON T. LUU et. al. | arxiv-cs.CL | 2025-10-23 |
| 372 | Hierarchical Sequence Iteration for Heterogeneous Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Retrieval-augmented generation (RAG) remains brittle on multi-step questions and heterogeneous evidence sources, trading accuracy against latency and token/tool budgets. This paper … |
Ruiyi Yang; Hao Xue; Imran Razzak; Hakim Hacid; Flora D. Salim; | arxiv-cs.CL | 2025-10-23 |
| 373 | Investigating LLM Capabilities on Long Context Comprehension for Medical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We shed light into some of the evaluation aspects using a multi-faceted approach. |
Feras AlMannaa; Talia Tseriotou; Jenny Chim; Maria Liakata; | arxiv-cs.CL | 2025-10-21 |
| 374 | Interpretable Question Answering with Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a question answering system that operates exclusively on a knowledge graph retrieval without relying on retrieval augmented generation (RAG) with large language models (LLMs). |
Kartikeya Aneja; Manasvi Srivastava; Subhayan Das; Nagender Aneja; | arxiv-cs.CL | 2025-10-21 |
| 375 | From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Both issues can mislead reasoning and undermine answer reliability. To address these challenges, we propose MedRGAG, a unified retrieval-generation augmented framework that seamlessly integrates external and parametric knowledge for medical QA. |
Lei Li; Xiao Zhou; Yingying Zhang; Xian Wu; | arxiv-cs.CL | 2025-10-21 |
| 376 | That’s Deprecated! Understanding, Detecting, and Steering Knowledge Conflicts in Language Models for Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Building on prior question-answering (QA) research, we extend the investigation of knowledge conflicts to the realm of code generation. We propose a domain-agnostic framework for constructing and interpreting such conflicts, along with a novel evaluation method and dataset tailored to code conflict scenarios. |
JAESUNG BAE et. al. | arxiv-cs.CL | 2025-10-21 |
| 377 | IMB: An Italian Medical Benchmark for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present two comprehensive Italian medical benchmarks: **IMB-QA**, containing 782,644 patient-doctor conversations from 77 medical categories, and **IMB-MCQA**, comprising 25,862 multiple-choice questions from medical specialty examinations. |
Antonio Romano; Giuseppe Riccio; Mariano Barone; Marco Postiglione; Vincenzo Moscato; | arxiv-cs.CL | 2025-10-21 |
| 378 | AV-Master: Dual-Path Comprehensive Perception Makes Better Audio-Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This limits their reasoning capability in complex scenarios. To address these challenges, we propose a novel framework named AV-Master. |
JIAYU ZHANG et. al. | arxiv-cs.CV | 2025-10-21 |
| 379 | Explainable Bilingual Medical-Question-Answering Model Using Ensemble Learning Technique Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study establishes a foundation for building multilingual healthcare information systems, promoting inclusive and equitable access to medical information. |
Abdul Rahaman Wahab Sait; Yazeed Alkhurayyif; | Electronics | 2025-10-21 |
| 380 | Object-centric Video Question Answering with Visual Grounding and Referring Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing models primarily focus on high-level comprehension and are limited to text-only responses, restricting the flexibility for object-centric, multi-round interactions. In this paper, we make three contributions: (i) we address these limitations by introducing a VideoLLM, termed **RGA3**, capable of performing both object referring and grounding for video reasoning tasks in a multi-round conversational manner, i.e., allowing users to iteratively interact with videos using both textual and visual queries; (ii) we propose **STOM** (Spatial-Temporal Overlay Module), a novel approach that allows arbitrary visual prompts to be processed at any timestamp within a video; (iii) we present **VideoInfer**, a manually curated object-centric video instruction dataset featuring question-answering pairs that require reasoning. |
HAOCHEN WANG et. al. | iccv | 2025-10-20 |
| 381 | ETVA: Evaluation of Text-to-Video Alignment Via Fine-grained Question Generation and Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing text-to-video alignment metrics like CLIPScore only generate coarse-grained scores without fine-grained alignment details, failing to align with human preference. To address this limitation, we propose ETVA, a novel Evaluation method of Text-to-Video Alignment via fine-grained question generation and answering. |
KAISI GUAN et. al. | iccv | 2025-10-20 |
| 382 | Overcoming Dual Drift for Continual Long-Tailed Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Continual Long-Tailed Visual Question Answering (CLT-VQA) and identify two critical challenges: inner-task prototype drift, where classifier prototypes become biased toward majority classes due to imbalanced data, and inter-task feature drift, where learned features shift over time, causing forgetting of previously learned knowledge. To address these challenges, we propose a unified dual-balance approach that integrates a Balanced Classifier Prototype (BCP) learning module and a Multi-modal Feature Alignment (MFA) module. |
Feifei Zhang; Zhihao Wang; Xi Zhang; Changsheng Xu; | iccv | 2025-10-20 |
| 383 | AVAM: A Universal Training-free Adaptive Visual Anchoring Embedded Into Multimodal Large Language Model for Multi-image Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a straightforward yet universal Adaptive Visual Anchoring strategy, which can be seamlessly integrated into existing MLLMs, offering significant accuracy improvements through adaptive compression. |
Kang Zeng; Guojin Zhong; Jintao Cheng; Jin Yuan; Zhiyong Li; | iccv | 2025-10-20 |
| 384 | HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a new task to benchmark human-in-scene understanding for embodied agents: Human-In-Scene Question Answering (HIS-QA). |
Jiahe Zhao; Ruibing Hou; Zejie Tian; Hong Chang; Shiguang Shan; | iccv | 2025-10-20 |
| 385 | Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present QUestion-only replay with Attention Distillation (QUAD), a novel approach for VQACL that leverages only past task questions for regularization. |
Imad Eddine Marouf; Enzo Tartaglione; Stéphane Lathuilière; Joost Van De Weijer; | iccv | 2025-10-20 |
| 386 | TOGA: Temporally Grounded Open-Ended Video QA with Weak Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Given a video and a question, we generate an open-ended answer grounded with the start and end time. For this task, we propose TOGA: a vision-language model for Temporally Grounded Open-Ended Video QA with Weak Supervision. |
AYUSH GUPTA et. al. | iccv | 2025-10-20 |
| 387 | PVChat: Personalized Video Chat with One-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we introduce an automated augmentation pipeline that synthesizes identity-preserving positive samples and retrieves hard negatives from existing video corpora, generating a diverse training dataset with four QA types: existence, appearance, action, and location inquiries. |
YUFEI SHI et. al. | iccv | 2025-10-20 |
| 388 | Beyond The Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To improve exploration efficiency, we propose Fine-EQA, a hybrid exploration model that integrates frontier-based and goal-oriented navigation to guide agents toward task-relevant regions more effectively. |
KAIXUAN JIANG et. al. | iccv | 2025-10-20 |
| 389 | ReasonVQA: A Multi-hop Reasoning Benchmark with Structural Knowledge for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a new dataset, ReasonVQA, for the Visual Question Answering (VQA) task. |
Duong T. Tran; Trung-Kien Tran; Manfred Hauswirth; Danh Le Phuoc; | iccv | 2025-10-20 |
| 390 | Acknowledging Focus Ambiguity in Visual Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: No published work on visual question answering (VQA) accounts for ambiguity regarding where the content described in the question is located in the image. To fill this gap, we introduce VQ-FocusAmbiguity, the first VQA dataset that visually grounds each plausible image region a question could refer to when arriving at valid answers. |
Chongyan Chen; Yu-Yun Tseng; Zhuoheng Li; Anush Venkatesh; Danna Gurari; | iccv | 2025-10-20 |
| 391 | SMR-agents: Synergistic Medical Reasoning Agents for Zero-shot Medical Visual Question Answering with MLLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Dujuan Wang; Tao Cheng; Sutong Wang; Y. Chen; Yunqiang Yin; | Inf. Process. Manag. | |
| 392 | SceneCOT: Eliciting Grounded Chain-of-Thought Reasoning in 3D Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper bridges the gap by presenting a novel framework. |
Xiongkun Linghu; Jiangyong Huang; Ziyu Zhu; Baoxiong Jia; Siyuan Huang; | arxiv-cs.CV | 2025-10-19 |
| 393 | AutoGraph-R1: End-to-End Reinforcement Learning for Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, its effectiveness is hindered by a fundamental disconnect: the knowledge graph (KG) construction process is decoupled from its downstream application, yielding suboptimal graph structures. To bridge this gap, we introduce AutoGraph-R1, the first framework to directly optimize KG construction for task performance using Reinforcement Learning (RL). |
HONG TING TSANG et. al. | arxiv-cs.CL | 2025-10-17 |
| 394 | SQuAI: Scientific Question-Answering with Multi-Agent Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present SQuAI (https://squai.scads.ai/), a scalable and trustworthy multi-agent retrieval-augmented generation (RAG) framework for scientific question answering (QA) with large language models (LLMs). |
Ines Besrour; Jingbo He; Tobias Schreieder; Michael Färber; | arxiv-cs.IR | 2025-10-17 |
| 395 | Prompt Design for Medical Question Answering with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
LEONID KULIGIN et. al. | Machine Learning with Applications | 2025-10-17 |
| 396 | DTKG: Dual-Track Knowledge Graph-Verified Reasoning Framework for Multi-Hop QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These limitations deteriorate the efficiency and accuracy for multi-hop QA tasks. To address this challenge, we propose a novel dual-track KG verification and reasoning framework DTKG, which is inspired by the Dual Process Theory in cognitive science. |
CHANGHAO WANG et. al. | arxiv-cs.AI | 2025-10-17 |
| 397 | MedTrust-RAG: Evidence Verification and Trust Alignment for Biomedical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose MedTrust-Guided Iterative RAG, a framework designed to enhance factual consistency and mitigate hallucinations in medical QA. |
YINGPENG NING et. al. | arxiv-cs.CL | 2025-10-16 |
| 398 | Interactive Environment-Aware Planning System and Dialogue for Social Robots in Early Childhood Education Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we propose an interactive environment-aware dialog and planning system for social robots in early childhood education, aimed at supporting the learning and social interaction of young children. |
Jiyoun Moon; Seung Min Song; | Applied Sciences | 2025-10-16 |
| 399 | Applications and Challenges of Retrieval-Augmented Generation (RAG) in Maternal Health: A Multi-Axial Review of The State of The Art in Biomedical QA with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this context, retrieval-augmented generation (RAG) systems provide a promising approach to enhance traceability, timeliness, and accuracy in tasks such as biomedical question answering (QA). This article presents a narrative and thematic review of the evolution of these technologies in maternal health, structured across five axes: technical foundations of RAG, advancements in biomedical LLMs, conversational agents in healthcare, clinical validation frameworks, and specific applications in obstetric telehealth. |
ADRIANA NOGUERA et. al. | Sci | 2025-10-16 |
| 400 | Finding Answers in Thought Matters: Revisiting Evaluation on Large Language Models with Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: On the other hand, for models requiring reasoning, the method of answer extraction plays a critical role. Our research reveals that the performance of reasoning models and their final answer distributions are highly sensitive to the answer extraction algorithm employed. In order to mitigate this, we propose a basic framework: Answer Regeneration. The method uses an additional model inference, providing the prior input and output prefaced by the prompt "Answer:". |
HWIYEOL JO et. al. | arxiv-cs.CL | 2025-10-16 |
| 401 | PRISM: Agentic Retrieval with LLMs for Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce an Agentic Retrieval System that leverages large language models (LLMs) in a structured loop to retrieve relevant evidence with high precision and recall. |
Md Mahadi Hasan Nahid; Davood Rafiei; | arxiv-cs.CL | 2025-10-16 |
| 402 | BioMedSearch: A Multi-Source Biomedical Retrieval Framework Based on LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To evaluate the accuracy of question answering, we constructed a multi-level dataset, BioMedMCQs, consisting of 3,000 questions. |
CONGYING LIU et. al. | arxiv-cs.CL | 2025-10-15 |
| 403 | Who’s Asking? Evaluating LLM Robustness to Inquiry Personas in Factual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present the first systematic evaluation of LLM robustness to inquiry personas, i.e. user profiles that convey attributes like identity, expertise, or belief. |
Nil-Jana Akpinar; Chia-Jung Lee; Vanessa Murdock; Pietro Perona; | arxiv-cs.CL | 2025-10-14 |
| 404 | ESI: Epistemic Uncertainty Quantification Via Semantic-preserving Intervention for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we establish a connection between the uncertainty of LLMs and their invariance under semantic-preserving intervention from a causal perspective. |
Mingda Li; Xinyu Li; Weinan Zhang; Longxuan Ma; | arxiv-cs.CL | 2025-10-14 |
| 405 | An Empirical Study for Representations of Videos in Video Question Answering Via MLLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present a comprehensive empirical study of video representation methods for VideoQA with MLLMs. |
Zhi Li; Yanan Wang; Hao Niu; Julio Vizcarra; Masato Taya; | arxiv-cs.IR | 2025-10-14 |
| 406 | Discrepancy Detection at The Data Level: Toward Consistent Multilingual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We evaluate MIND on a bilingual QA system in the maternal and infant health domain and release a dataset of bilingual questions annotated for factual and cultural inconsistencies. |
LORENA CALVO-BARTOLOMÉ et. al. | arxiv-cs.CL | 2025-10-13 |
| 407 | VeritasFi: An Adaptable, Multi-tiered RAG Framework for Multi-modal Financial Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Retrieval-Augmented Generation (RAG) is becoming increasingly essential for Question Answering (QA) in the financial sector, where accurate and contextually grounded insights from complex public disclosures are crucial. However, existing financial RAG systems face two significant challenges: (1) they struggle to process heterogeneous data formats, such as text, tables, and figures; and (2) they encounter difficulties in balancing general-domain applicability with company-specific adaptation. |
ZHENGHAN TAI et. al. | arxiv-cs.IR | 2025-10-12 |
| 408 | RIPRAG: Hack A Black-box Retrieval-Augmented Generation Question-Answering System with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate a more complex and realistic scenario: the attacker lacks knowledge of the RAG system’s internal composition and implementation details, and the RAG system comprises components beyond a mere retriever. |
MENG XI et. al. | arxiv-cs.AI | 2025-10-11 |
| 409 | AssoMem: Scalable Memory QA with Multi-Signal Associative Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by how humans link information associatively, we propose AssoMem, a novel framework constructing an associative memory graph that anchors dialogue utterances to automatically extracted clues. |
KAI ZHANG et. al. | arxiv-cs.CL | 2025-10-11 |
| 410 | LONGQAEVAL: Designing Reliable Evaluations of Long-Form Clinical QA Under Resource Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce LongQAEval, an evaluation framework and set of evaluation recommendations for limited-resource and high-expertise settings. Based on physician annotations of 300 real patient questions answered by physicians and LLMs, we compare coarse answer-level versus fine-grained sentence-level evaluation over the dimensions of correctness, relevance, and safety. |
Federica Bologna; Tiffany Pan; Matthew Wilkens; Yue Guo; Lucy Lu Wang; | arxiv-cs.CL | 2025-10-11 |
| 411 | NG-Router: Graph-Supervised Multi-Agent Collaboration for Nutrition Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To further address contextual overload, we propose a gradient-based subgraph retrieval mechanism that identifies salient evidence during training, thereby enhancing multi-hop and relational reasoning. Extensive experiments across multiple benchmarks and backbone models demonstrate that NG-Router consistently outperforms both single-agent and ensemble baselines, offering a principled approach to domain-aware multi-agent reasoning for complex nutritional health tasks. |
KAIWEN SHI et. al. | arxiv-cs.CL | 2025-10-10 |
| 412 | Closing The Data-Efficiency Gap Between Autoregressive and Masked Diffusion LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This proposed method successfully and drastically improves the data efficiency of arLLM fine-tuning, effectively closing the performance gap with dLLMs. |
Xu Pan; Ely Hahami; Jingxuan Fan; Ziqian Xie; Haim Sompolinsky; | arxiv-cs.CL | 2025-10-10 |
| 413 | Research on Sem-RAG: A Corn Planting Knowledge Question-Answering Algorithm Based on Fine-Grained Semantic Information Retrieval Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, in knowledge-intensive domains such as agriculture, hallucination and insufficient retrieval accuracy remain challenging. To address these issues, we propose Sem-RAG, a corn planting knowledge question-answering algorithm based on fine-grained semantic retrieval enhancement. |
Bing Bai; Xiaoyan Meng; Chenzi Zhao; | Applied Sciences | 2025-10-09 |
| 414 | IDQuAD: Infectious Disease Question and Answering Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In conclusion, this study introduces IDQuAD as a foundational dataset for infectious disease research, demonstrating the effectiveness of fine-tuning LLMs and paving the way for future advances in dataset development and LLM refinement for infectious disease tasks. |
Soonchan Kwon; Sujeong Hur; Beakcheol Jang; | PLOS One | 2025-10-09 |
| 415 | AI Knowledge Assist: An Automated Approach for The Creation of Knowledge Bases for Conversational AI Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we introduce AI Knowledge Assist, a system that extracts knowledge in the form of question-answer (QA) pairs from historical customer-agent conversations to automatically build a knowledge base. |
Md Tahmid Rahman Laskar; Julien Bouvier Tremblay; Xue-Yong Fu; Cheng Chen; Shashi Bhushan TN; | arxiv-cs.CL | 2025-10-09 |
| 416 | Test-Time Reasoners Are Strategic Multiple-Choice Test-Takers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In all, we challenge claims that partial-input success is always a flaw, so we discuss how reasoning traces could separate problematic data from less problematic reasoning. |
Nishant Balepur; Atrey Desai; Rachel Rudinger; | arxiv-cs.CL | 2025-10-09 |
| 417 | SUBQRAG: Sub-question Driven Dynamic Graph Rag Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Graph Retrieval-Augmented Generation (Graph RAG) effectively builds a knowledge graph (KG) to connect disparate facts across a large document corpus. However, this broad-view … |
JIAOYANG LI et. al. | arxiv-cs.CL | 2025-10-08 |
| 418 | EverydayMMQA: A Multilingual and Multimodal Framework for Culturally Grounded Spoken Visual QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large-scale multimodal models achieve strong results on tasks like Visual Question Answering (VQA), but they often fail when queries require culturally grounded, everyday knowledge, particularly in low-resource and underrepresented languages. To bridge this gap, we introduce Everyday Multimodal and Multilingual QA (EverydayMMQA), a framework for creating large-scale, culturally-grounded datasets for spoken and visual question answering (SVQA). |
FIROJ ALAM et. al. | arxiv-cs.CL | 2025-10-07 |
| 419 | Asking For It: Question-Answering for Predicting Rule Infractions in Online Content Moderation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Online communities rely on a mix of platform policies and community-authored rules to define acceptable behavior and maintain order. However, these rules vary widely across … |
Mattia Samory; Diana Pamfile; Andrew To; Shruti Phadke; | arxiv-cs.CY | 2025-10-07 |
| 420 | Multi-Hop Question Answering: When Can Humans Help, and Where Do They Struggle? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To better understand how humans might collaborate effectively with AI, we evaluate the performance of crowd workers on these individual reasoning subtasks. We find that while humans excel at knowledge integration (97% accuracy), they often fail to recognize when a question requires multi-hop reasoning (67% accuracy). |
Jinyan Su; Claire Cardie; Jennifer Healey; | arxiv-cs.HC | 2025-10-06 |
| 421 | AgentRouter: A Knowledge-Graph-Guided LLM Router for Collaborative Multi-Agent Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose AgentRouter, a framework that formulates multi-agent QA as a knowledge-graph-guided routing problem supervised by empirical performance signals. |
ZHEYUAN ZHANG et. al. | arxiv-cs.CL | 2025-10-06 |
| 422 | Video-in-the-Loop: Span-Grounded Long Video QA with Interleaved Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present *Video-in-the-Loop* (ViTL), a two-stage long-video QA framework that preserves a fixed token budget by first *localizing* question-relevant interval(s) with a low-fps skim and then *answering* via span-aware reallocation of visual tokens at higher effective frame rate, emitting an interleaved output with both spans and the final option for direct attribution. |
CHENDONG WANG et. al. | arxiv-cs.CV | 2025-10-05 |
| 423 | StepChain GraphRAG: Reasoning Over Knowledge Graphs for Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Yet, challenges persist in integrating iterative reasoning steps with external knowledge retrieval. To address this, we introduce StepChain GraphRAG, a framework that unites question decomposition with a Breadth-First Search (BFS) Reasoning Flow for enhanced multi-hop QA. |
TENGJUN NI et. al. | arxiv-cs.CL | 2025-10-03 |
| 424 | LEAML: Label-Efficient Adaptation to Out-of-Distribution Visual Tasks for Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce LEAML, a label-efficient adaptation framework that leverages both scarce labeled VQA samples and abundant unlabeled images. |
Ci-Siang Lin; Min-Hung Chen; Yu-Yang Sheng; Yu-Chiang Frank Wang; | arxiv-cs.CV | 2025-10-03 |
| 425 | Knowledge Graph-Guided Multi-Agent Distillation for Reliable Industrial Question Answering with Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Industrial question-answering (QA) systems require higher safety and reliability than general-purpose dialogue models, as errors in high-risk scenarios such as equipment fault diagnosis can have severe consequences. Although multi-agent large language models enhance reasoning depth, they suffer from uncontrolled iterations and unverifiable outputs, and conventional distillation methods struggle to transfer collaborative reasoning capabilities to lightweight, deployable student models. |
Jiqun Pan; Zhenke Duan; Jiani Tu; Anzhi Cheng; Yanqing Wang; | arxiv-cs.CL | 2025-10-03 |
| 426 | Uncertainty As Feature Gaps: Epistemic Uncertainty Quantification of LLMs in Contextual Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we focus on UQ for the contextual QA task and propose a theoretically grounded approach to quantify epistemic uncertainty. |
YAVUZ BAKMAN et. al. | arxiv-cs.CL | 2025-10-02 |
| 427 | AccurateRAG: A Framework for Building Accurate Retrieval-Augmented Question-Answering Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce AccurateRAG, a novel framework for constructing high-performance question-answering applications based on retrieval-augmented generation (RAG). |
LINH THE NGUYEN et. al. | arxiv-cs.CL | 2025-10-02 |
| 428 | TAG-EQA: Text-And-Graph for Event Question Answering Via Structured Prompting Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce TAG-EQA (Text-And-Graph for Event Question Answering), a prompting framework that injects causal event graphs into LLM inputs by converting structured relations into natural-language statements. TAG-EQA spans nine prompting configurations, combining three strategies (zero-shot, few-shot, chain-of-thought) with three input modalities (text-only, graph-only, text+graph), enabling a systematic analysis of when and how structured knowledge aids inference. |
Maithili Kadam; Francis Ferraro; | arxiv-cs.CL | 2025-10-01 |
| 429 | One More Question Is Enough, Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose Expert Question Decomposition (EQD), an approach designed to balance the use of domain knowledge with computational efficiency. |
Mengyu Wang; Sotirios Sabanis; Miguel de Carvalho; Shay B. Cohen; Tiejun Ma; | arxiv-cs.CL | 2025-10-01 |
| 430 | Question Answering System Based on The Combination of Large Language Model and Knowledge Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jihong Wang; Yichen Zhang; Wei Liu; | Applied Intelligence | 2025-10-01 |
| 431 | A Multimodal LLM Approach for Visual Question Answering on Multiparametric 3D Brain MRI Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce mpLLM, a prompt-conditioned hierarchical mixture-of-experts (MoE) architecture for visual question answering over multi-parametric 3D brain MRI (mpMRI). |
ARVIND MURARI VEPA et. al. | arxiv-cs.CV | 2025-09-30 |
| 432 | Boosting Process-Correct CoT Reasoning By Modeling Solvability of Multiple-Choice QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study this through multiple-choice question answering (MCQA), which provides a controlled setting with fixed answer options. |
Raphael Schumann; Stefan Riezler; | arxiv-cs.AI | 2025-09-30 |
| 433 | RAGferee: Building Contextual Reward Models for Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the lack of publicly available RAG-centric preference datasets and specialised RMs, we introduce RAGferee, a methodology that repurposes question-answering (QA) datasets into preference pairs that prioritise groundedness over stylistic features, enabling the training of contextual RMs better suited to judging RAG responses. |
ANDREI C. COMAN et. al. | arxiv-cs.CL | 2025-09-30 |
| 434 | Beyond Isolated Facts: Synthesizing Narrative and Grounded Supervision for VideoQA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To move beyond this paradigm, we introduce a framework to synthesize richer supervisory signals. |
JIANXIN LIANG et. al. | arxiv-cs.CV | 2025-09-29 |
| 435 | Saliency Guided Longitudinal Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a saliency-guided encoder-decoder for chest X-ray Diff-VQA that turns post-hoc saliency into actionable supervision. |
Jialin Wu; Xiaofeng Liu; | arxiv-cs.AI | 2025-09-29 |
| 436 | Can VLM Pseudo-Labels Train A Time-Series QA Model That Outperforms The VLM? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Alternatively, with recent advancements in large-scale models, vision-language models (VLMs) have demonstrated the potential to analyze time-series signals in a zero-shot manner. In this paper, we propose a training approach that uses pseudo labels generated by a VLM. Although VLMs can produce incorrect labels, TSQA models can still be effectively trained based on the property that deep neural networks are inherently robust to such noisy labels. |
Takuya Fujimura; Kota Dohi; Natsuo Yamashita; Yohei Kawaguchi; | arxiv-cs.LG | 2025-09-29 |
| 437 | Knowledge Extraction on Semi-Structured Content: Does It Remain Relevant for Question Answering in The Era of LLMs? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The advent of Large Language Models (LLMs) has significantly advanced web-based Question Answering (QA) systems over semi-structured content, raising questions about the continued utility of knowledge extraction for question answering. This paper investigates the value of triple extraction in this new paradigm by extending an existing benchmark with knowledge extraction annotations and evaluating commercial and open-source LLMs of varying sizes. Our results show that web-scale knowledge extraction remains a challenging task for LLMs. |
KAI SUN et. al. | arxiv-cs.CL | 2025-09-29 |
| 438 | Do LLMs Understand Romanian Driving Laws? A Study on Multimodal and Fine-Tuned Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper evaluates Large Language Models (LLMs) on Romanian driving-law QA with explanation generation. |
Eduard Barbu; Adrian Marius Dumitran; | arxiv-cs.CL | 2025-09-28 |
| 439 | JGU Mainz’s Submission to The WMT25 Shared Task on LLMs with Limited Resources for Slavic Languages: MT and QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents the JGU Mainz submission to the WMT25 Shared Task on LLMs with Limited Resources for Slavic Languages: Machine Translation and Question Answering, focusing on Ukrainian, Upper Sorbian, and Lower Sorbian. |
Hossain Shaikh Saadi; Minh Duc Bui; Mario Sanz-Guerrero; Katharina von der Wense; | arxiv-cs.CL | 2025-09-26 |
| 440 | From Evidence to Trajectory: Abductive Reasoning Path Synthesis for Training Retrieval-Augmented Generation Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose EviPath, an evidence-anchored reasoning path synthesis paradigm for RAG agent development. |
MUZHI LI et. al. | arxiv-cs.CL | 2025-09-26 |
| 441 | Thinking in Many Modes: How Composite Reasoning Elevates Large Language Model Performance with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this, we introduce Composite Reasoning (CR), a novel reasoning approach empowering LLMs to dynamically explore and combine multiple reasoning styles like deductive, inductive, and abductive for more nuanced problem-solving. |
Zishan Ahmad; Saisubramaniam Gopalakrishnan; | arxiv-cs.CL | 2025-09-26 |
| 442 | Do LLM Agents Know How to Ground, Recover, and Assess? A Benchmark for Epistemic Competence in Information-Seeking Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent work has explored training Large Language Model (LLM) search agents with reinforcement learning (RL) for open-domain question answering (QA). |
Jiaqi Shao; Yuxiang Lin; Munish Prasad Lohani; Yufeng Miao; Bing Luo; | arxiv-cs.AI | 2025-09-26 |
| 443 | MIRAGE: Multi-hop Reasoning with Ambiguity Evaluation for Illusory Questions Highlight: To establish a robust baseline, we propose CLarifying Ambiguity with a Reasoning and InstructiON (CLARION), a multi-agent framework that significantly outperforms existing approaches on MIRAGE, paving the way for more adaptive and robust reasoning systems. |
JEONGHYUN PARK et al. | arxiv-cs.CL | 2025-09-26 |
| 444 | A Comprehensive Evaluation of Transformer-Based Question Answering Models and RAG-Enhanced Design Highlight: This paper presents a comprehensive evaluation of retrieval strategies for multi-hop question answering within a retrieval-augmented generation framework. |
Zichen Zhang; Kunlong Zhang; Hongwei Ruan; Yiming Luo; | arxiv-cs.CV | 2025-09-26 |
| 445 | Detecting (Un)answerability in Large Language Models with Linear Directions Highlight: In this work, we study the problem of (un)answerability detection, focusing on extractive question answering (QA) where the model should determine if a passage contains sufficient information to answer a given question. |
Maor Juliet Lavi; Tova Milo; Mor Geva; | arxiv-cs.CL | 2025-09-26 |
| 446 | Beyond Stars: Bridging The Gap Between Ratings and Review Sentiment with LLM Highlight: We present an advanced approach to mobile app review analysis aimed at addressing limitations inherent in traditional star-rating systems. |
Najla Zuhir; Amna Mohammad Salim; Parvathy Premkumar; Moshiur Farazi; | arxiv-cs.AI | 2025-09-25 |
| 447 | SCRA-VQA: Summarized Caption-Rerank for Augmented Large Language Models in Visual Question Answering Highlight: However, the captions frequently include excessive noise irrelevant to the question, and LLMs generally do not comprehend VQA tasks, limiting their reasoning capabilities. To address this issue, we propose the Summarized Caption-Rerank Augmented VQA (SCRA-VQA), which employs a pre-trained visual language model to convert images into captions. |
YAN ZHANG et al. | arxiv-cs.CV | 2025-09-25 |
| 448 | LOCA: Logical Chain Augmentation for Scientific Corpus Cleaning Highlight: However, existing scientific question-answering (QA) datasets suffer from high error rates, frequently resulting from logical leaps and implicit reasoning within the answers. To address this issue, we introduce LOCA (Logical Chain Augmentation), a novel framework for automatically cleaning scientific corpora, implemented through an augment-and-review loop. |
YOU-LE FANG et al. | arxiv-cs.CL | 2025-09-24 |
| 449 | RJE: A Retrieval-Judgment-Exploration Framework for Efficient Knowledge Graph Question Answering with LLMs Highlight: To address these limitations, we propose Retrieval-Judgment-Exploration (RJE), a framework that retrieves refined reasoning paths, evaluates their sufficiency, and conditionally explores additional evidence. |
CAN LIN et al. | arxiv-cs.CL | 2025-09-24 |
| 450 | Are Smaller Open-Weight LLMs Closing The Gap to Proprietary Models for Biomedical Question Answering? Highlight: In this work, we compare several open-weight models against top-performing systems such as GPT-4o, GPT-4.1, Claude 3.5 Sonnet, and Claude 3.7 Sonnet. |
Damian Stachura; Joanna Konieczna; Artur Nowak; | arxiv-cs.CL | 2025-09-23 |
| 451 | Consistency-Aware Parameter-Preserving Knowledge Editing Framework for Multi-Hop Question Answering Abstract: Parameter-Preserving Knowledge Editing (PPKE) enables updating models with new or corrected information without retraining or parameter adjustment. Recent PPKE approaches based on … |
Lingwen Deng; Yifei Han; Long Zhang; Yue Du; Bin Li; | arxiv-cs.CL | 2025-09-23 |
| 452 | Pathways of Thoughts: Multi-Directional Thinking for Long-form Personalized Question Answering Highlight: However, personalized QA remains relatively underexplored due to challenges such as inferring preferences from long, noisy, and implicit contexts, and generating responses that are simultaneously correct, contextually appropriate, and aligned with user expectations and background knowledge. To address these challenges, we propose Pathways of Thoughts (PoT), an inference-stage method that applies to any large language model (LLM) without requiring task-specific fine-tuning. |
ALIREZA SALEMI et al. | arxiv-cs.CL | 2025-09-23 |
| 453 | NeuS-QA: Grounding Long-Form Video Understanding in Temporal Logic and Neuro-Symbolic Reasoning Highlight: As a result, there are no formal guarantees that the sampled context actually encodes the compositional or causal logic demanded by the question. To address these foundational gaps, we introduce NeuS-QA, a training-free, plug-and-play neuro-symbolic pipeline for LVQA. |
SAHIL SHAH et al. | arxiv-cs.CV | 2025-09-22 |
| 454 | Semantic Reformulation Entropy for Robust Hallucination Detection in QA Tasks Highlight: We propose Semantic Reformulation Entropy (SRE), which improves uncertainty estimation in two ways. |
CHAODONG TONG et al. | arxiv-cs.CL | 2025-09-22 |
| 455 | Memory-QA: Answering Recall Questions Based on Multimodal Memories Highlight: This task poses unique challenges, including the creation of task-oriented memories, the effective utilization of temporal and location information within memories, and the ability to draw upon multiple memories to answer a recall question. To address these challenges, we propose a comprehensive pipeline, Pensieve, integrating memory-specific augmentation, time- and location-aware multi-signal retrieval, and multi-memory QA fine-tuning. |
HONGDA JIANG et al. | arxiv-cs.AI | 2025-09-22 |
| 456 | Towards Adaptive Context Management for Intelligent Conversational Question Answering Highlight: This particular paper introduces an Adaptive Context Management (ACM) framework for the Conversational Question Answering (ConvQA) systems. |
Manoj Madushanka Perera; Adnan Mahmood; Kasun Eranda Wijethilake; Quan Z. Sheng; | arxiv-cs.CL | 2025-09-22 |
| 457 | AirQA: A Comprehensive QA Dataset for AI Research with Instance-Level Evaluation Highlight: In this work, we propose AirQA, a human-annotated comprehensive paper QA dataset in the field of artificial intelligence (AI), with 13,948 papers and 1,246 questions, that encompasses multi-task, multi-modal and instance-level evaluation. |
TIANCHENG HUANG et al. | arxiv-cs.CL | 2025-09-21 |
| 458 | LLaVul: A Multimodal LLM for Interpretable Vulnerability Reasoning About Source Code Highlight: Our model is trained to integrate paired code and natural queries into a unified space, enhancing reasoning and context-dependent insights about code vulnerability. |
Ala Jararweh; Michael Adams; Avinash Sahu; Abdullah Mueen; Afsah Anwar; | arxiv-cs.AI | 2025-09-21 |
| 459 | Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data Abstract: Recent speech-LLMs have shown impressive performance in tasks like transcription and translation, yet they remain limited in understanding the paralinguistic aspects of speech … |
QIONGQIONG WANG et al. | arxiv-cs.CL | 2025-09-20 |
| 460 | Question Answering with LLMs and Learning from Answer Sets Highlight: In this work, we introduce LLM2LAS, a hybrid system that effectively combines the natural language understanding capabilities of LLMs, the rule induction power of the Learning from Answer Sets (LAS) system ILASP, and the formal reasoning strengths of Answer Set Programming (ASP). |
MANUEL BORROTO et al. | arxiv-cs.AI | 2025-09-20 |
| 461 | Comparing RAG and GraphRAG for Page-Level Retrieval Question Answering on Math Textbook Highlight: Overall, this study highlights both the promises and challenges of page-level retrieval systems in educational contexts, emphasizing the need for more refined retrieval methods to build reliable AI tutoring solutions in providing reference page numbers. |
EASON CHEN et al. | arxiv-cs.IR | 2025-09-20 |
| 462 | Time to Revist Exact Match Highlight: We introduce TempAnswerQA, a benchmark distilled from Test of Time and TempTabQA, where all questions require a numerical, temporal answer, allowing us to evaluate models beyond EM. We use the forecasting metrics symmetric mean absolute percentage error (sMAPE) and mean absolute scaled error (MASE). |
Auss Abbood; Zaiqiao Meng; Nigel Collier; | arxiv-cs.CL | 2025-09-20 |
| 463 | Jamendo-QA: A Large-Scale Music Question Answering Dataset Highlight: We introduce Jamendo-QA, a large-scale dataset for Music Question Answering (Music-QA). |
Junyoung Koh; Soo Yong Kim; Yongwon Choi; Gyu Hyeong Choi; | arxiv-cs.MM | 2025-09-19 |
| 464 | SWE-QA: Can Language Models Answer Repository-level Code Questions? IF:3 Highlight: In this paper, we present SWE-QA, a repository-level code question answering (QA) benchmark designed to facilitate research on automated QA systems in realistic code environments. |
WEIHAN PENG et al. | arxiv-cs.CL | 2025-09-18 |
| 465 | Quantifying Uncertainty in Natural Language Explanations of Large Language Models for Question Answering Highlight: Notably, generating valid uncertainty estimates for natural language explanations is particularly challenging due to the auto-regressive generation process of LLMs and the presence of noise in medical inquiries. To bridge this gap, in this work, we first propose a novel uncertainty estimation framework for these generated natural language explanations, which provides valid uncertainty guarantees in a post-hoc and model-agnostic manner. |
Yangyi Li; Mengdi Huai; | arxiv-cs.CL | 2025-09-18 |
| 466 | HistoryBankQA: Multilingual Temporal Question Answering on Historical Events Highlight: Existing temporal reasoning datasets are limited in scale, lack multilingual coverage and focus more on contemporary events. To address these limitations, we present HistoryBank, a multilingual database of 10M+ historical events extracted from Wikipedia timeline pages and article infoboxes. |
Biswadip Mandal; Anant Khandelwal; Manish Gupta; | arxiv-cs.CL | 2025-09-16 |
| 467 | AQUA-LLM: Evaluating Accuracy, Quantization, and Adversarial Robustness Trade-offs in LLMs for Cybersecurity Question Answering Highlight: We propose AQUA-LLM, an evaluation framework designed to benchmark several state-of-the-art small LLMs under four distinct configurations: base, quantized-only, fine-tuned, and fine-tuned combined with quantization, specifically for cybersecurity QA. |
Onat Gungor; Roshan Sood; Harold Wang; Tajana Rosing; | arxiv-cs.CR | 2025-09-16 |
| 468 | Graph-Enhanced Retrieval-Augmented Question Answering for E-Commerce Customer Support Highlight: This paper develops a novel retrieval-augmented generation (RAG) framework that uses knowledge graphs (KGs) to improve the relevance of the answer and the factual grounding. |
Piyushkumar Patel; | arxiv-cs.CL | 2025-09-15 |
| 469 | Bridging Vision Language Models and Symbolic Grounding for Video Question Answering Highlight: We study symbolic scene graphs (SGs) as intermediate grounding signals for VQA. |
Haodi Ma; Vyom Pathak; Daisy Zhe Wang; | arxiv-cs.CV | 2025-09-15 |
| 470 | ParaEQsA: Parallel and Asynchronous Embodied Questions Scheduling and Answering Highlight: This paper formulates the Embodied Questions Answering (EQsA) problem, introduces a corresponding benchmark, and proposes a system to tackle the problem. |
Haisheng Wang; Weiming Zhi; | arxiv-cs.RO | 2025-09-15 |
| 471 | MORQA: Benchmarking Evaluation Metrics for Medical Open-Ended Question Answering Highlight: In this work, we introduce MORQA (Medical Open-Response QA), a new multilingual benchmark designed to assess the effectiveness of NLG evaluation metrics across three medical visual and text-based QA datasets in English and Chinese. |
WEN-WAI YIM et al. | arxiv-cs.CL | 2025-09-15 |
| 472 | FineQuest: Adaptive Knowledge-Assisted Sports Video Understanding Via Agent-of-Thoughts Reasoning Highlight: In this work, we propose FineQuest, the first training-free framework that leverages dual-mode reasoning inspired by cognitive science: i) Reactive Reasoning for straightforward sports queries and ii) Deliberative Reasoning for more complex ones. |
Haodong Chen; Haojian Huang; XinXiang Yin; Dian Shao; | arxiv-cs.CV | 2025-09-15 |
| 473 | AgenticIE: An Adaptive Agent for Information Extraction from Complex Regulatory Documents Abstract: Declaration of Performance (DoP) documents, mandated by EU regulation, certify the performance of construction products. There are two challenges to make DoPs machine and human … |
Gaye Colakoglu; Gürkan Solmaz; Jonathan Fürst; | arxiv-cs.CL | 2025-09-15 |
| 474 | Improving LLMs’ Learning for Coreference Resolution Highlight: In this paper, we investigate the limitations of existing LLM-based approaches to CR, specifically the Question-Answering (QA) Template and Document Template methods, and propose two novel techniques: Reversed Training with Joint Inference and Iterative Document Generation. |
Yujian Gan; Yuan Liang; Yanni Lin; Juntao Yu; Massimo Poesio; | arxiv-cs.CL | 2025-09-14 |
| 475 | !MSA at AraHealthQA 2025 Shared Task: Enhancing LLM Performance for Arabic Clinical Question Answering Through Prompt Engineering and Ensemble Learning Highlight: We present our systems for Track 2 (General Arabic Health QA, MedArabiQ) of the AraHealthQA-2025 shared task, where our methodology secured 2nd place in both Sub-Task 1 (multiple-choice question answering) and Sub-Task 2 (open-ended question answering) in Arabic clinical contexts. |
Mohamed Tarek; Seif Ahmed; Mohamed Basem; | arxiv-cs.CL | 2025-09-14 |
| 476 | Constructing A Question-Answering Simulator Through The Distillation of LLMs Highlight: In this paper, we propose a method named LLM Distillation based Simulator (LDSim), which distills domain knowledge and reasoning capability from an LLM to better assist prediction, thereby improving simulation performance. |
Haipeng Liu; Ting Long; Jing Fu; | arxiv-cs.LG | 2025-09-11 |
| 477 | Agentic LLMs for Question Answering Over Tabular Data Highlight: We propose a Natural Language to SQL (NL-to-SQL) approach leveraging large language models (LLMs) such as GPT-4o, GPT-4o-mini, and DeepSeek v2:16b to generate SQL queries dynamically. |
Rishit Tyagi; Mohit Gupta; Rahul Bouri; | arxiv-cs.CL | 2025-09-11 |
| 478 | A Knowledge Noise Mitigation Framework for Knowledge-based Visual Question Answering Highlight: Knowledge-based visual question answering (KB-VQA) requires a model to understand images and utilize external knowledge to provide accurate answers. Existing approaches often directly augment models with retrieved information from knowledge sources while ignoring substantial knowledge redundancy, which introduces noise into the answering process. To address this, we propose a training-free framework with knowledge focusing for KB-VQA, that mitigates the impact of noise by enhancing knowledge relevance and reducing redundancy. First, for knowledge retrieval, our framework concludes essential parts from the image-question pairs, creating low-noise queries that enhance the retrieval of highly relevant knowledge. |
Zhiyue Liu; Sihang Liu; Jinyuan Liu; Xinru Zhang; | arxiv-cs.CV | 2025-09-11 |
| 479 | Retrieval-Augmented Generation for Reliable Interpretation of Radio Regulations Highlight: We study question answering in the domain of radio regulations, a legally sensitive and high-stakes area. We propose a telecom-specific Retrieval-Augmented Generation (RAG) pipeline and introduce, to our knowledge, the first multiple-choice evaluation set for this domain, constructed from authoritative sources using automated filtering and human validation. |
Zakaria El Kassimi; Fares Fourati; Mohamed-Slim Alouini; | arxiv-cs.IR | 2025-09-11 |
| 480 | Fusing Knowledge and Language: A Comparative Study of Knowledge Graph-Based Question Answering with LLMs Highlight: While traditional Retrieval Augmented Generation (RAG) approaches are proficient in fact-based and local context-based extraction from concise texts, they encounter limitations when addressing the thematic and holistic understanding of complex, extensive texts, requiring a deeper analysis of both text and context. This paper presents a comprehensive technical comparative study of three different methodologies for constructing knowledge graph triplets and integrating them with Large Language Models (LLMs) for question answering: spaCy, Stanford CoreNLP-OpenIE, and GraphRAG, all leveraging open source technologies. |
Vaibhav Chaudhary; Neha Soni; Narotam Singh; Amita Kapoor; | arxiv-cs.AI | 2025-09-11 |
| 481 | Enhancing Ancient Ceramic Knowledge Services: A Question Answering System Using Fine-Tuned Models and GraphRAG Abstract: To address the challenges of extensive domain expertise and deficient semantic comprehension in the digital preservation of ancient ceramics, this paper proposes a knowledge … |
Zhi Chen; Bingxiang Liu; | Inf. | 2025-09-11 |
| 482 | LLM Ensemble for RAG: Role of Context Length in Zero-Shot Question Answering for BioASQ Challenge Highlight: In this work, we explore how large language models (LLMs) can be used for information retrieval (IR), and an ensemble of zero-shot models can accomplish state-of-the-art performance on a domain-specific Yes/No QA task. |
Dima Galat; Diego Molla-Aliod; | arxiv-cs.CL | 2025-09-10 |
| 483 | A Role-Aware Multi-Agent Framework for Financial Education Question Answering with LLMs Highlight: We evaluated our framework on a set of 3,532 expert-designed finance education questions from Study.com, an online learning platform. |
Andy Zhu; Yingjun Du; | arxiv-cs.CL | 2025-09-10 |
| 484 | TextlessRAG: End-to-End Visual Document RAG By Speech Without Text Highlight: We propose TextlessRAG, the first end-to-end framework for speech-based question answering over large-scale document images. |
PEIJIN XIE et al. | arxiv-cs.CV | 2025-09-09 |
| 485 | Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided Decomposition Highlight: To address this issue, we propose a novel Iterative Retrieval-Augmented Knowledge Editing method with guided decomposition (IRAKE) through the guidance from single edited facts and entire edited cases. |
Yi Liu; Xiangrong Zhu; Xiangyu Liu; Wei Wei; Wei Hu; | arxiv-cs.CL | 2025-09-09 |
| 486 | The Role of Exploration Modules in Small Language Models for Knowledge Graph Question Answering Highlight: In this study, we investigate the capabilities of existing integration methods for small language models (SLMs) in KG-based question answering and observe that their performance is often constrained by their limited ability to traverse and reason over knowledge graphs. To address this limitation, we propose leveraging simple and efficient exploration modules to handle knowledge graph traversal in place of the language model itself. |
Yi-Jie Cheng; Oscar Chew; Yun-Nung Chen; | arxiv-cs.CL | 2025-09-09 |
| 487 | RTLExplain: A Structured Approach to RTL Code Summarization and Question Answering for Medium-to-Large Designs Using LLMs Abstract: Large Language Models (LLMs) show promise in assisting with Register Transfer Level (RTL) design tasks, including code summarization, documentation, and question answering. … |
TING-HSUN CHI et al. | 2025 ACM/IEEE 7th Symposium on Machine Learning for CAD … | 2025-09-08 |
| 488 | Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in Large Language Models Highlight: When LLMs memorize outdated medical knowledge, they can provide harmful advice or fail at clinical reasoning tasks. To investigate this problem, we introduce two novel question-answering (QA) datasets derived from systematic reviews: MedRevQA (16,501 QA pairs covering general biomedical knowledge) and MedChangeQA (a subset of 512 QA pairs where medical consensus has changed over time). |
Juraj Vladika; Mahdi Dhaini; Florian Matthes; | arxiv-cs.CL | 2025-09-04 |
| 489 | KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question Answering Highlight: We present KERAG, a novel KG-based RAG pipeline that enhances QA coverage by retrieving a broader subgraph likely to contain relevant information. |
YUSHI SUN et al. | arxiv-cs.CL | 2025-09-04 |
| 490 | CMRAG: Co-modality-based Visual Document Retrieval and Question Answering Highlight: However, existing methods have limitations when dealing with multimodal documents: one category of methods relies on layout analysis and text extraction, which can only utilize explicit text information and struggle to capture images or unstructured content; the other category treats document segmentation as visual input and directly passes it to visual language models (VLMs) for processing, yet it ignores the semantic advantages of text, leading to suboptimal retrieval and generation results. To address these research gaps, we propose the Co-Modality-based RAG (CMRAG) framework, which can simultaneously leverage texts and images for more accurate retrieval and generation. |
WANG CHEN et al. | arxiv-cs.CL | 2025-09-02 |
| 491 | Bio-inspired Product Design System Integrating Retrieval-augmented Question Answering and Semantic Fusion Diffusion Model |
Xinhui Kang; Wenjie You; Ying Luo; | Adv. Eng. Informatics | 2025-09-01 |
| 492 | TEQA: Temporal Knowledge Graph Enhanced Question Answering |
Qian Liu; Siling Feng; Mengxing Huang; | Knowl. Based Syst. | 2025-09-01 |
| 493 | Understanding Question-answering Systems: Evolution, Applications, Trends, and Challenges |
Amer Farea; Frank Emmert-Streib; | Eng. Appl. Artif. Intell. | 2025-09-01 |
| 494 | TreeQA: Enhanced LLM-RAG with Logic Tree Reasoning for Reliable and Interpretable Multi-hop Question Answering |
XIANGRUI ZHANG et al. | Knowl. Based Syst. | 2025-09-01 |
| 495 | LOSDF: A Logical Optimization and Semantic Decoupling Framework for Question Answering in Multi-party Conversations |
SHU ZHOU et al. | Inf. Process. Manag. | 2025-09-01 |
| 496 | Decomposing and Revising What Language Models Generate Highlight: However, the generated questions are often irrelevant and incomplete, resulting in a loss of facts in retrieval. These approaches also fail to aggregate evidence snippets from different documents and paragraphs. To tackle these problems, we propose a new fact decomposition-based framework called FIDES (faithful context enhanced fact decomposition and evidence aggregation) for attributed QA. |
ZHICHAO YAN et al. | arxiv-cs.CL | 2025-08-31 |
| 497 | CaresAI at BioCreative IX Track 1 — LLM for Biomedical QA Highlight: We introduce a two-stage inference pipeline for precise short-answer extraction to mitigate verbosity and improve alignment with evaluation metrics. Despite partial improvements, challenges persist in generating strictly formatted outputs. |
Reem Abdel-Salam; Mary Adewunmi; Modinat A. Abayomi; | arxiv-cs.CL | 2025-08-31 |
| 498 | Geospatial Question Answering on Historical Maps Using Spatio-Temporal Knowledge Graphs and Large Language Models Highlight: In this project, we developed a GeoQA system by integrating a spatio-temporal knowledge graph (KG) constructed from historical map data with large language models (LLMs). |
Ziyi Liu; Sidi Wu; Lorenz Hurni; | arxiv-cs.IR | 2025-08-29 |
| 499 | Benchmarking GPT-5 for Biomedical Natural Language Processing Abstract: The rapid expansion of biomedical literature has heightened the need for scalable natural language processing (NLP) solutions. While GPT-4 substantially narrowed the gap with … |
Yu Hou; Zaifu Zhan; Rui Zhang; | ArXiv | 2025-08-28 |
| 500 | Overview of BioASQ 2025: The Thirteenth BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering IF:3 Abstract: This is an overview of the thirteenth edition of the BioASQ challenge in the context of the Conference and Labs of the Evaluation Forum (CLEF) 2025. BioASQ is a series of … |
ANASTASIOS NENTIDIS et al. | arxiv-cs.CL | 2025-08-28 |
| 501 | Overview of BioASQ 2024: The Twelfth BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering IF:3 Abstract: This is an overview of the twelfth edition of the BioASQ challenge in the context of the Conference and Labs of the Evaluation Forum (CLEF) 2024. BioASQ is a series of international … |
ANASTASIOS NENTIDIS et al. | arxiv-cs.CL | 2025-08-28 |
| 502 | AraHealthQA 2025: The First Shared Task on Arabic Health Question Answering Highlight: We introduce AraHealthQA 2025, the Comprehensive Arabic Health Question Answering Shared Task, held in conjunction with ArabicNLP 2025 (co-located with EMNLP 2025). |
HASSAN ALHUZALI et al. | arxiv-cs.CL | 2025-08-27 |
| 503 | AI-SearchPlanner: Modular Agentic Search Via Pareto-Optimal Multi-Objective Reinforcement Learning Highlight: In this paper, we propose AI-SearchPlanner, a novel reinforcement learning framework designed to enhance the performance of frozen QA models by focusing on search planning. Specifically, our approach introduces three key innovations: 1) Decoupling the Architecture of the Search Planner and Generator, 2) Dual-Reward Alignment for Search Planning, and 3) Pareto Optimization of Planning Utility and Cost, to achieve the objectives. |
Lang Mei; Zhihan Yang; Chong Chen; | arxiv-cs.AI | 2025-08-27 |
| 504 | Extracting Information from Scientific Literature Via Visual Table Question Answering Models Highlight: This study explores three approaches to processing table data in scientific papers to enhance extractive question answering and develop a software tool for the systematic review process. |
Dongyoun Kim; Hyung-do Choi; Youngsun Jang; John Kim; | arxiv-cs.IR | 2025-08-26 |
| 505 | Knowing or Guessing? Robust Medical Visual Question Answering Via Joint Consistency and Contrastive Learning Highlight: When evaluating state-of-the-art (SOTA) models like LLaVA-Med on RoMed, we observe alarming performance drops (e.g., a 40% decline in Recall) compared to original VQA benchmarks, exposing critical robustness gaps. |
SONGTAO JIANG et al. | arxiv-cs.CL | 2025-08-26 |
| 506 | MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language Models Highlight: To compile MQAD, our methodology leverages specialized Music Information Retrieval (MIR) models to extract higher-level musical features and Large Language Models (LLMs) to generate natural language QA pairs. |
ZHIHAO OUYANG et al. | arxiv-cs.SD | 2025-08-26 |
| 507 | Chronological Passage Assembling in RAG Framework for Temporal Question Answering Highlight: Specifically, understanding narrative texts requires more than isolated segments, as the broader context and sequential relationships between segments are crucial for comprehension. To address these limitations, we propose ChronoRAG, a novel RAG framework specialized for narrative texts. This approach focuses on two essential aspects: refining dispersed document information into coherent and structured passages and preserving narrative flow by explicitly capturing and maintaining the temporal order among retrieved passages. |
Byeongjeong Kim; Jeonghyun Park; Joonho Yang; Hwanhee Lee; | arxiv-cs.CL | 2025-08-26 |
| 508 | Can Out-of-Distribution Evaluations Uncover Reliance on Shortcuts? A Case Study in Question Answering Highlight: Despite their practicality, such evaluations build upon a strong assumption: that OOD evaluations can capture and reflect upon possible failures in a real-world deployment. In this work, we challenge this assumption and confront the results obtained from OOD evaluations with a set of specific failure modes documented in existing question-answering (QA) models, referred to as a reliance on spurious features or prediction shortcuts. |
Michal Štefánik; Timothee Mickus; Marek Kadlčík; Michal Spiegel; Josef Kuchař; | arxiv-cs.CL | 2025-08-25 |
| 509 | AVAM: Universal Training-free Adaptive Visual Anchoring Embedded Into Multimodal Large Language Model for Multi-image Question Answering Highlight: In this paper, we propose a straightforward yet universal Adaptive Visual Anchoring strategy, which can be seamlessly integrated into existing MLLMs, offering significant accuracy improvements through adaptive compression. |
Kang Zeng; Guojin Zhong; Jintao Cheng; Jin Yuan; Zhiyong Li; | arxiv-cs.CV | 2025-08-25 |
| 510 | ST-Raptor: LLM-Powered Semi-Structured Table Question Answering Highlight: Second, methods like NL2Code and multi-modal LLM QA struggle to understand the complex layouts of semi-structured tables and cannot accurately answer corresponding questions. To this end, we propose ST-Raptor, a tree-based framework for semi-structured table question answering using large language models. |
ZIRUI TANG et. al. | arxiv-cs.AI | 2025-08-25 |
| 511 | Agri-Query: A Case Study on RAG Vs. Long-Context LLMs for Cross-Lingual Technical Question Answering Highlight: We present a case study evaluating large language models (LLMs) with 128K-token context windows on a technical question answering (QA) task. |
Julius Gun; Timo Oksanen; | arxiv-cs.CL | 2025-08-25 |
| 512 | Omne-R1: Learning to Reason with Memory for Multi-hop Question Answering Highlight: This paper introduces Omne-R1, a novel approach designed to enhance multi-hop question answering capabilities on schema-free knowledge graphs by integrating advanced reasoning models. |
BOYUAN LIU et. al. | arxiv-cs.CL | 2025-08-24 |
| 513 | PlantVillageVQA: A Visual Question Answering Dataset for Benchmarking Vision-Language Models in Plant Science Highlight: Our objective remains to provide a publicly available, standardised and expert-verified database to enhance diagnostic accuracy for plant disease identifications and advance scientific research in the agricultural domain. |
Syed Nazmus Sakib; Nafiul Haque; Mohammad Zabed Hossain; Shifat E. Arman; | arxiv-cs.CV | 2025-08-23 |
| 514 | Distilcyphergpt: Enhancing Large Language Models for Knowledge Graph Question Answering in Cypher Through Knowledge Distillation |
You Li Chong; Chin-Poo Lee; Ming Kim Lim; | Data Mining and Knowledge Discovery | 2025-08-23 |
| 515 | PediatricsMQA: A Multi-modal Pediatrics Question Answering Benchmark Highlight: Evaluating state-of-the-art open models, we find dramatic performance drops in younger cohorts, highlighting the need for age-aware methods to ensure equitable AI support in pediatric care. |
Adil Bahaj; Oumaima Fadi; Mohamed Chetouani; Mounir Ghogho; | arxiv-cs.CY | 2025-08-22 |
| 516 | MizanQA: Benchmarking Large Language Models on Moroccan Legal Question Answering Highlight: This paper introduces MizanQA (pronounced Mizan, meaning scale in Arabic, a universal symbol of justice), a benchmark designed to evaluate LLMs on Moroccan legal question answering (QA) tasks, characterised by rich linguistic and legal complexity. |
Adil Bahaj; Mounir Ghogho; | arxiv-cs.CL | 2025-08-22 |
| 517 | DSRAG: A Domain-Specific Retrieval Framework Based on Document-derived Multimodal Knowledge Graph Abstract: Current general-purpose large language models (LLMs) commonly exhibit knowledge hallucination and insufficient domain-specific adaptability in domain-specific tasks, limiting … |
MENGZHENG YANG et. al. | ArXiv | 2025-08-22 |
| 518 | Hierarchical Vision-Language Reasoning for Multimodal Multiple-Choice Question Answering Highlight: Multimodal Large Language Models (MLLMs) have demonstrated remarkable multimodal understanding capabilities in Visual Question Answering (VQA) tasks by integrating visual and textual features. |
AO ZHOU et. al. | arxiv-cs.IR | 2025-08-22 |
| 519 | XLQA: A Benchmark for Locale-Aware Multilingual Open-Domain Question Answering Highlight: To address these limitations, we introduce XLQA, a novel benchmark explicitly designed for locale-sensitive multilingual ODQA. |
Keon-Woo Roh; Yeong-Joon Ju; Seong-Whan Lee; | arxiv-cs.CL | 2025-08-22 |
| 520 | MedQARo: A Large-Scale Benchmark for Medical Question Answering in Romanian Abstract: Question answering (QA) is an actively studied topic, being a core natural language processing (NLP) task that needs to be addressed before achieving Artificial General Intelligence … |
ANA-CRISTINA ROGOZ et. al. | arxiv-cs.CL | 2025-08-22 |
| 521 | M3TQA: Massively Multilingual Multitask Table Question Answering Highlight: Existing multilingual table benchmarks suffer from geolinguistic imbalance – overrepresenting certain languages and lacking sufficient scale for rigorous cross-lingual analysis. To address these limitations, we introduce a comprehensive framework for massively multilingual multitask table question answering, featuring m3TQA-Instruct, a large-scale benchmark spanning 97 languages across diverse language families, including underrepresented and low-resource languages. |
DAIXIN SHU et. al. | arxiv-cs.CL | 2025-08-22 |
| 522 | Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering Highlight: We further argue that knowledge acquisition should be progressive, mirroring human learning: first understanding concepts, then applying them to complex reasoning. To address this, we propose Selct2Know (S2K), a cost-effective framework that internalizes domain knowledge through an internal-external knowledge self-selection strategy and selective supervised fine-tuning. |
BOLEI HE et. al. | arxiv-cs.CL | 2025-08-20 |
| 523 | MedCoT-RAG: Causal Chain-of-Thought RAG for Medical Question Answering Highlight: We introduce MedCoT-RAG, a domain-specific framework that combines causal-aware document retrieval with structured chain-of-thought prompting tailored to medical workflows. |
Ziyu Wang; Elahe Khatibi; Amir M. Rahmani; | arxiv-cs.CL | 2025-08-20 |
| 524 | Towards LLM-generated Explanations for Component-based Knowledge Graph Question Answering Systems Highlight: In this paper, we focus on the explainability of component-based systems for Question Answering (QA). |
Dennis Schiese; Aleksandr Perevalov; Andreas Both; | arxiv-cs.SE | 2025-08-20 |
| 525 | Interactive Query Answering on Knowledge Graphs with Soft Entity Constraints Highlight: In practice, many real-world queries involve constraints that are inherently vague or context-dependent, such as preferences for attributes or related categories. Addressing this gap, we introduce the problem of query answering with soft constraints. |
DANIEL DAZA et. al. | arxiv-cs.AI | 2025-08-19 |
| 526 | AdaDocVQA: Adaptive Framework for Long Document Visual Question Answering in Low-Resource Settings Highlight: Ablation studies confirm meaningful contributions from each component, and our framework establishes new state-of-the-art results for Japanese document VQA while providing a scalable foundation for other low-resource languages and specialized domains. |
Haoxuan Li; Wei Song; Aofan Liu; Peiwu Qin; | arxiv-cs.CL | 2025-08-19 |
| 527 | Mitigating Easy Option Bias in Multiple-Choice Question Answering Highlight: In this early study, we observe an Easy-Options Bias (EOB) issue in some multiple-choice Visual Question Answering (VQA) benchmarks such as MMStar, RealWorldQA, SEED-Bench, Next-QA, STAR benchmark and Video-MME. |
Hao Zhang; Chen Li; Basura Fernando; | arxiv-cs.CV | 2025-08-18 |
| 528 | Consensus or Conflict? Fine-Grained Evaluation of Conflicting Answers in Question-Answering Highlight: Large Language Models (LLMs) have demonstrated strong performance in question answering (QA) tasks. |
Eviatar Nachshoni; Arie Cattan; Shmuel Amar; Ori Shapira; Ido Dagan; | arxiv-cs.CL | 2025-08-17 |
| 529 | Adversarial Attacks on VQA-NLE: Exposing and Alleviating Inconsistencies in Visual Question Answering Explanations Highlight: Natural language explanations in visual question answering (VQA-NLE) aim to make black-box models more transparent by elucidating their decision-making processes. |
Yahsin Yeh; Yilun Wu; Bokai Ruan; Honghan Shuai; | arxiv-cs.CV | 2025-08-17 |
| 530 | Cross-Granularity Hypergraph Retrieval-Augmented Generation for Multi-hop Question Answering Highlight: In this paper, we propose a novel RAG approach called HGRAG for MHQA that achieves cross-granularity integration of structural and semantic information via hypergraphs. |
Changjian Wang; Weihong Deng; Weili Guan; Quan Lu; Ning Jiang; | arxiv-cs.CL | 2025-08-15 |
| 531 | MobQA: A Benchmark Dataset for Semantic Understanding of Human Mobility Data Through Question Answering Highlight: This paper presents MobQA, a benchmark dataset designed to evaluate the semantic understanding capabilities of large language models (LLMs) for human mobility data through natural language question answering. |
Hikaru Asano; Hiroki Ouchi; Akira Kasuga; Ryo Yonetani; | arxiv-cs.CL | 2025-08-14 |
| 532 | Learning from Natural Language Feedback for Personalized Question Answering Highlight: We introduce VAC, a novel framework for personalized response generation that replaces scalar rewards with natural language feedback (NLF) that are generated conditioned on the user profiles and the question narratives. |
Alireza Salemi; Hamed Zamani; | arxiv-cs.CL | 2025-08-14 |
| 533 | Medico 2025: Visual Question Answering for Gastrointestinal Imaging Highlight: The Medico 2025 challenge addresses Visual Question Answering (VQA) for Gastrointestinal (GI) imaging, organized as part of the MediaEval task series. The challenge focuses on developing Explainable Artificial Intelligence (XAI) models that answer clinically relevant questions based on GI endoscopy images while providing interpretable justifications aligned with medical reasoning. |
Sushant Gautam; Vajira Thambawita; Michael Riegler; Pål Halvorsen; Steven Hicks; | arxiv-cs.CV | 2025-08-14 |
| 534 | STRIDE-QA: Visual Question Answering Dataset for Spatiotemporal Reasoning in Urban Driving Scenes Highlight: Vision-Language Models (VLMs) have been applied to autonomous driving to support decision-making in complex real-world scenarios. |
Keishi Ishihara; Kento Sasaki; Tsubasa Takahashi; Daiki Shiono; Yu Yamaguchi; | arxiv-cs.CV | 2025-08-14 |
| 535 | EgoCross: Benchmarking Multimodal Large Language Models for Cross-Domain Egocentric Video Question Answering Highlight: We hope EgoCross and our accompanying analysis will serve as a foundation for advancing domain-adaptive, robust egocentric video understanding. |
YANJUN LI et. al. | arxiv-cs.CV | 2025-08-14 |
| 536 | Transforming Questions and Documents for Semantically Aligned Retrieval-Augmented Generation Highlight: We introduce a novel retrieval-augmented generation (RAG) framework tailored for multihop question answering. |
Seokgi Lee; | arxiv-cs.CL | 2025-08-13 |
| 537 | BERT-VQA: Visual Question Answering on Plots Highlight: Visual question answering has been an exciting challenge in the field of natural language understanding, as it requires deep learning models to exchange information from both vision and language domains. In this project, we aim to tackle a subtask of this problem, namely visual question answering on plots. |
Tai Vu; Robert Yang; | arxiv-cs.LG | 2025-08-13 |
| 538 | RAGulating Compliance: A Multi-Agent Knowledge Graph for Regulatory QA Highlight: Regulatory compliance question answering (QA) requires precise, verifiable information, and domain-specific expertise, posing challenges for Large Language Models (LLMs). In this work, we present a novel multi-agent framework that integrates a Knowledge Graph (KG) of Regulatory triplets with Retrieval-Augmented Generation (RAG) to address these demands. |
Bhavik Agarwal; Hemant Sunil Jomraj; Simone Kaplunov; Jack Krolick; Viktoria Rojkova; | arxiv-cs.AI | 2025-08-13 |
| 539 | LyS at SemEval 2025 Task 8: Zero-Shot Code Generation for Tabular QA Highlight: This paper describes our participation in SemEval 2025 Task 8, focused on Tabular Question Answering. |
Adrián Gude; Roi Santos-Ríos; Francisco Prado-Valiño; Ana Ezquerro; Jesús Vilares; | arxiv-cs.CL | 2025-08-12 |
| 540 | Capabilities of GPT-5 on Multimodal Medical Reasoning IF:3 Highlight: A representative case study demonstrates GPT-5’s ability to integrate visual and textual cues into a coherent diagnostic reasoning chain, recommending appropriate high-stakes interventions. |
Shansong Wang; Mingzhe Hu; Qiang Li; Mojtaba Safari; Xiaofeng Yang; | arxiv-cs.CL | 2025-08-11 |
| 541 | VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding Highlight: However, existing benchmarks focus on English-only document retrieval or only consider multilingual question-answering on a single-page image. To bridge this gap, we introduce VisR-Bench, a multilingual benchmark designed for question-driven multimodal retrieval in long documents. Our benchmark comprises over 35K high-quality QA pairs across 1.2K documents, enabling fine-grained evaluation of multimodal retrieval. |
JIAN CHEN et. al. | arxiv-cs.CV | 2025-08-10 |
| 542 | HealthBranches: Synthesizing Clinically-Grounded Question Answering Datasets Via Decision Pathways Abstract: HealthBranches is a novel benchmark dataset for medical Question-Answering (Q&A), specifically designed to evaluate complex reasoning in Large Language Models (LLMs). This dataset … |
CRISTIAN COSENTINO et. al. | arxiv-cs.CL | 2025-08-10 |
| 543 | ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering Highlight: However, no known study tests the LLMs’ robustness when presented with obfuscated versions of questions. To systematically evaluate these limitations, we propose a novel technique, ObfusQAte and, leveraging the same, introduce ObfusQA, a comprehensive, first of its kind, framework with multi-tiered obfuscation levels designed to examine LLM capabilities across three distinct dimensions: (i) Named-Entity Indirection, (ii) Distractor Indirection, and (iii) Contextual Overload. |
Shubhra Ghosh; Abhilekh Borah; Aditya Kumar Guru; Kripabandhu Ghosh; | arxiv-cs.CL | 2025-08-10 |
| 544 | Two-Stage Quranic QA Via Ensemble Retrieval and Instruction-Tuned Answer Extraction Highlight: In this paper, we propose a novel two-stage framework that addresses both passage retrieval and answer extraction. |
Mohamed Basem; Islam Oshallah; Ali Hamdi; Khaled Shaban; Hozaifa Kassab; | arxiv-cs.CL | 2025-08-09 |
| 545 | BharatBBQ: A Multilingual Bias Benchmark for Question Answering in The Indian Context Highlight: Evaluating social biases in language models (LMs) is crucial for ensuring fairness and minimizing the reinforcement of harmful stereotypes in AI systems. Existing benchmarks, such as the Bias Benchmark for Question Answering (BBQ), primarily focus on Western contexts, limiting their applicability to the Indian context. To address this gap, we introduce BharatBBQ, a culturally adapted benchmark designed to assess biases in Hindi, English, Marathi, Bengali, Tamil, Telugu, Odia, and Assamese. |
Aditya Tomar; Nihar Ranjan Sahoo; Pushpak Bhattacharyya; | arxiv-cs.CL | 2025-08-09 |
| 546 | Harnessing Adaptive Topology Representations for Zero-Shot Graph Question Answering Highlight: Built on these, we develop the DynamicTRF framework, which aims to improve both the accuracy and conciseness of graph QA. |
YANBIN WEI et. al. | arxiv-cs.CL | 2025-08-08 |
| 547 | QA-Dragon: Query-Aware Dynamic RAG System for Knowledge-Intensive Visual Question Answering Highlight: To address this limitation, we propose QA-Dragon, a Query-Aware Dynamic RAG System for Knowledge-Intensive VQA. |
Zhuohang Jiang; Pangjing Wu; Xu Yuan; Wenqi Fan; Qing Li; | arxiv-cs.AI | 2025-08-07 |
| 548 | Conformal P-Value in Multiple-Choice Question Answering Tasks with Provable Risk Control Highlight: This study introduces a significance testing-enhanced conformal prediction (CP) framework to improve trustworthiness of large language models (LLMs) in multiple-choice question answering (MCQA). |
Yuanchang Ye; | arxiv-cs.CL | 2025-08-07 |
| 549 | Hop, Skip, and Overthink: Diagnosing Why Reasoning Models Fumble During Multi-Hop Analysis Highlight: We introduce a novel, nuanced error categorization framework that examines failures across three critical dimensions: the diversity and uniqueness of source documents involved (hops), completeness in capturing relevant information (coverage), and cognitive inefficiency (overthinking). |
ANUSHKA YADAV et. al. | arxiv-cs.CL | 2025-08-06 |
| 550 | Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning IF:3 Highlight: While recent approaches have explored text-based chain-of-thought (CoT) reasoning for MLLMs, these methods often suffer from limited cross-modal interaction and increased hallucination, especially with longer videos or reasoning chains. To address these challenges, we propose Video Intelligence via Tool-Augmented Learning (VITAL), a novel end-to-end agentic video reasoning framework. |
HAOJI ZHANG et. al. | arxiv-cs.CV | 2025-08-06 |
| 551 | CF-RAG: A Dataset and Method for Carbon Footprint QA Using Retrieval-Augmented Generation Highlight: In this paper, we tackle the challenge of answering questions related to carbon footprints within sustainability reports available in PDF format. |
Kaiwen Zhao; Bharathan Balaji; Stephen Lee; | arxiv-cs.CL | 2025-08-05 |
| 552 | An Entity Linking Agent for Question Answering Highlight: We propose an entity linking agent for QA, based on a Large Language Model that simulates human cognitive workflows. |
YAJIE LUO et. al. | arxiv-cs.CL | 2025-08-05 |
| 553 | OpenLifelogQA: An Open-Ended Multi-Modal Lifelog Question-Answering Dataset Highlight: In this paper, we present a novel lifelog QA dataset called OpenLifelogQA, building upon an 18-month lifelog dataset. |
Quang-Linh Tran; Binh Nguyen; Gareth J. F. Jones; Cathal Gurrin; | arxiv-cs.MM | 2025-08-05 |
| 554 | Domain-Specific Fine-Tuning and Prompt-Based Learning: A Comparative Study for Developing Natural Language-Based BIM Information Retrieval Systems Highlight: This study presents a comparative analysis of two prominent approaches for developing NLI-based BIM information retrieval systems: domain-specific fine-tuning and prompt-based learning using large language models (LLMs). |
Han Gao; Timo Hartmann; Botao Zhong; Kai Lia; Hanbin Luo; | arxiv-cs.IR | 2025-08-05 |
| 555 | A Multi-Agent System for Complex Reasoning in Radiology Visual Question Answering Highlight: We introduce a multi-agent system (MAS) designed to support complex reasoning in RVQA, with specialized agents for context understanding, multimodal reasoning, and answer validation. |
Ziruo Yi; Jinyu Liu; Ting Xiao; Mark V. Albert; | arxiv-cs.AI | 2025-08-04 |
| 556 | Evaluating Variance in Visual Question Answering Benchmarks Highlight: Despite their advancements, the evaluation of MLLMs on VQA benchmarks often relies on point estimates, overlooking the significant variance in performance caused by factors such as stochastic model outputs, training seed sensitivity, and hyperparameter configurations. This paper critically examines these issues by analyzing variance across 14 widely used VQA benchmarks, covering diverse tasks such as visual reasoning, text understanding, and commonsense reasoning. |
Nikitha SR; | arxiv-cs.CV | 2025-08-04 |
| 557 | SustainableQA: A Comprehensive Question Answering Dataset for Corporate Sustainability and EU Taxonomy Reporting Highlight: The growing demand for corporate sustainability transparency, particularly under new regulations like the EU Taxonomy, necessitates precise data extraction from large, unstructured corporate reports, a task for which Large Language Models and Retrieval-RAG systems require high-quality, domain-specific question-answering datasets. To address this, we introduce SustainableQA, a novel dataset and a scalable pipeline that generates comprehensive QA pairs from corporate sustainability and annual reports by integrating semantic chunk classification, a hybrid span extraction pipeline, and a specialized table-to-paragraph transformation. |
Mohammed Ali; Abdelrahman Abdallah; Adam Jatowt; | arxiv-cs.IR | 2025-08-04 |
| 558 | Contextually Aware E-Commerce Product Question Answering Using RAG Highlight: Existing Product Question Answering (PQA) systems often fail to utilize rich user context and diverse product information effectively. We propose a scalable, end-to-end framework for e-commerce PQA using Retrieval Augmented Generation (RAG) that deeply integrates contextual understanding. |
Praveen Tangarajan; Anand A. Rajasekar; Manish Rathi; Vinay Rao Dandin; Ozan Ersoy; | arxiv-cs.CL | 2025-08-03 |
| 559 | Harnessing Collective Intelligence of LLMs for Robust Biomedical QA: A Multi-Model Approach Highlight: In this work, we present our participation in the 13th edition of the BioASQ challenge, which involves biomedical semantic question-answering for Task 13b and biomedical question-answering for developing topics for the Synergy task. |
Dimitra Panou; Alexandros C. Dimopoulos; Manolis Koubarakis; Martin Reczko; | arxiv-cs.CL | 2025-08-02 |
| 560 | D-SCoRE: Document-Centric Segmentation and CoT Reasoning with Structured Export for QA-CoT Data Generation Highlight: The scarcity and high cost of high-quality question-answering (QA) datasets hinder supervised fine-tuning (SFT) for domain-specific large language models (LLMs). To address this, we introduce D-SCoRE, a training-free pipeline that utilizes LLMs and prompt engineering to produce diverse, high-quality QA datasets from arbitrary textual sources. |
Weibo Zhou; Lingbo Li; Shangsong Liang; | arxiv-cs.CL | 2025-08-02 |
| 561 | Prompting Large Language Models with Partial Knowledge for Answering Questions with Unseen Entities Highlight: Contrary to the conventional view, we propose a new perspective: LLMs can be awakened via partially relevant knowledge already embedded in LLMs. |
ZHICHAO YAN et. al. | arxiv-cs.CL | 2025-08-02 |
| 562 | MHier-RAG: Multi-Modal RAG for Visual-Rich Document Question-Answering Via Hierarchical and Multi-Granularity Reasoning Highlight: However, the former were susceptible to hallucinations, while the latter struggled for inter-modal disconnection and cross-page fragmentation. To address these challenges, a novel multi-modal RAG model, named MHier-RAG, was proposed, leveraging both textual and visual information across long-range pages to facilitate accurate question answering for visual-rich documents. |
Ziyu Gong; Chengcheng Mai; Yihua Huang; | arxiv-cs.MM | 2025-08-01 |
| 563 | ExVQA: A Novel Stacked Attention Networks with Extended Long Short-term Memory Model for Visual Question Answering |
Bui Thanh Hung; Duy Ho Vo Hoang; | Comput. Electr. Eng. | 2025-08-01 |
| 564 | Demo: TOSense — What Did You Just Agree To? Highlight: To avoid expensive manual annotation, we present a novel Question Answering Evaluation Pipeline (QEP) that generates synthetic questions and verifies the correctness of answers using clustered topic matching. |
Xinzhang Chen; Hassan Ali; Arash Shaghaghi; Salil S. Kanhere; Sanjay Jha; | arxiv-cs.CR | 2025-08-01 |
| 565 | Ifqa-llm: Intelligent Intention-driven Financial Question-answering with Large Language Models |
Fangshu Chen; Yilin Huang; Jiahui Wang; Chengcheng Yu; Xiankai Meng; | The Journal of Supercomputing | 2025-08-01 |
| 566 | Agentic Large Language Models Improve Retrieval-based Radiology Question Answering Highlight: Here we propose radiology Retrieval and Reasoning (RaR), a multi-step retrieval and reasoning framework designed to improve diagnostic accuracy, factual consistency, and clinical reliability of LLMs in radiology question answering. |
SEBASTIAN WIND et. al. | arxiv-cs.CL | 2025-08-01 |
| 567 | ITUNLP at SemEval-2025 Task 8: Question-Answering Over Tabular Data: A Zero-Shot Approach Using LLM-Driven Code Generation Highlight: This paper presents our system for SemEval-2025 Task 8: DataBench, Question-Answering over Tabular Data. |
Atakan Site; Emre Hakan Erdemir; Gülşen Eryiğit; | arxiv-cs.CL | 2025-08-01 |
| 568 | Cascaded Information Disclosure for Generalized Evaluation of Problem Solving Capabilities Highlight: While question-answering (QA) benchmark performance is an automatic and scalable method to compare LLMs, it is an indirect method of evaluating their underlying problem-solving capabilities. Therefore, we propose a holistic and generalizable framework based on cascaded question disclosure that provides a more accurate estimate of the models’ problem-solving capabilities while maintaining the scalability and automation. |
Yunxiang Yan; Tomohiro Sawada; Kartik Goyal; | arxiv-cs.CL | 2025-07-31 |
| 569 | A Benchmark Dataset and Evaluation Framework for Vietnamese Large Language Models in Customer Support Highlight: To address this gap, we introduce the Customer Support Conversations Dataset (CSConDa), a curated benchmark of over 9,000 QA pairs drawn from real interactions with human advisors at a large Vietnamese software company. |
LONG S. T. NGUYEN et. al. | arxiv-cs.CL | 2025-07-30 |
| 570 | CUS-QA: Local-Knowledge-Oriented Open-Ended Question Answering Dataset Highlight: We introduce CUS-QA, a benchmark for open-ended regional question answering that encompasses both textual and visual modalities. |
Jindřich Libovický; Jindřich Helcl; Andrei Manea; Gianluca Vico; | arxiv-cs.CL | 2025-07-30 |
| 571 | Exploring The Application of Visual Question Answering (VQA) for Classroom Activity Monitoring Highlight: In this paper, we investigate the applicability of several state-of-the-art open-source VQA models, including LLaMA2, LLaMA3, QWEN3, and NVILA, in the context of classroom behavior analysis. |
SINH TRONG VU et. al. | arxiv-cs.CV | 2025-07-30 |
| 572 | Solution for Meta KDD Cup’25: A Comprehensive Three-Step Framework for Vision Question Answering Highlight: This paper describes the solutions of all tasks in Meta KDD Cup’25 from BlackPearl team. |
Zijian Zhang; Xiaocheng Zhang; Yang Zhou; Zhimin Lin; Peng Yan; | arxiv-cs.IR | 2025-07-29 |
| 573 | Knowledge Editing for Multi-Hop Question Answering Using Semantic Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a knowledge editor for MQA based on semantic analysis called CHECK. |
Dominic Simon; Rickard Ewetz; | arxiv-cs.AI | 2025-07-29 |
| 574 | Analyzing The Sensitivity of Vision Language Models in Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we explore the sensitivity of Vision Language Models (VLMs) through the lens of cooperative principles of conversation proposed by Grice. |
Monika Shah; Sudarshan Balaji; Somdeb Sarkhel; Sanorita Dey; Deepak Venugopal; | arxiv-cs.CV | 2025-07-28 |
| 575 | Shapley Uncertainty in Natural Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: It primarily relies on setting threshold to measure the level of semantic equivalence relation. We propose a more nuanced framework that extends beyond such thresholding by developing a Shapley-based uncertainty metric that captures the continuous nature of semantic relationships. |
Meilin Zhu; Gaojie Jin; Xiaowei Huang; Lijun Zhang; | arxiv-cs.AI | 2025-07-28 |
| 576 | Hajj-FQA: A Benchmark Arabic Dataset for Developing Question-answering Systems on Hajj Fatwas Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Hayfa A. Aleid; Aqil M. Azmi; | Journal of King Saud University Computer and Information … | 2025-07-25 |
| 577 | RoD-TAL: A Benchmark for Answering Questions in Romanian Driving License Exams Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The intersection of AI and legal systems presents a growing need for tools that support legal education, particularly in under-resourced languages such as Romanian. In this work, we aim to evaluate the capabilities of Large Language Models (LLMs) and Vision-Language Models (VLMs) in understanding and reasoning about Romanian driving law through textual and visual question-answering tasks. To facilitate this, we introduce RoD-TAL, a novel multimodal dataset comprising Romanian driving test questions, text-based and image-based, alongside annotated legal references and human explanations. |
ANDREI VLAD MAN et. al. | arxiv-cs.CL | 2025-07-25 |
| 578 | A Graph-based Approach for Multi-Modal Question Answering from Flowcharts in Telecom Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present the end-to-end approach from processing technical documents, classifying image types, building graph representations, and incorporating them with the text embedding pipeline for efficient retrieval. |
SUMIT SOMAN et. al. | arxiv-cs.CL | 2025-07-25 |
| 579 | PDB-Eval: An Evaluation of Large Multimodal Models for Description and Explanation of Personalized Driving Behavior Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a benchmark, PDB-Eval, for a detailed understanding of Personalized Driver Behavior, and aligning Large Multimodal Models (MLLMs) with driving comprehension and reasoning. |
JUNDA WU et. al. | arxiv-cs.CV | 2025-07-24 |
| 580 | None of The Above: Comparing Scenarios for Answerability Detection in Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Julio Reyes-Montesinos; Álvaro Rodrigo; Anselmo Peñas; | Applied Intelligence | 2025-07-23 |
| 581 | TyDi QA-WANA: A Benchmark for Information-Seeking Question Answering in Languages of West Asia and North Africa Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present TyDi QA-WANA, a question-answering dataset consisting of 28K examples divided among 10 language varieties of western Asia and northern Africa. |
Parker Riley; Siamak Shakeri; Waleed Ammar; Jonathan H. Clark; | arxiv-cs.CL | 2025-07-23 |
| 582 | Leveraging Synthetic Data for Question Answering with Multilingual LLMs in The Agricultural Domain Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Publicly available general-purpose Large Language Models (LLMs) typically offer generic agriculture advisories, lacking precision in local and multilingual contexts. |
RISHEMJIT KAUR et. al. | arxiv-cs.CL | 2025-07-22 |
| 583 | GG-BBQ: German Gender Bias Benchmark for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In our work, we evaluate gender bias in German Large Language Models (LLMs) using the Bias Benchmark for Question Answering by Parrish et al. (2022) as a reference. |
SHALAKA SATHEESH et. al. | arxiv-cs.CL | 2025-07-22 |
| 584 | REVISE: A Framework for Revising OCRed Text in Practical Information Systems with Data Contamination Strategy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing approaches primarily focus on solving specific tasks, lacking the capability to structurally organize and systematically manage document information. To address this limitation, we propose Revise, a framework that systematically corrects errors introduced by OCR at the character, word, and structural levels. |
Gyuho Shim; Seongtae Hong; Heuiseok Lim; | acl | 2025-07-21 |
| 585 | ComRAG: Retrieval-Augmented Generation with Dynamic Vector Stores for Real-time Community Question Answering in Industry Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose ComRAG, a retrieval-augmented generation framework for real-time industrial CQA that integrates static knowledge with dynamic historical QA pairs via a centroid-based memory mechanism designed for retrieval, generation, and efficient storage. |
QINWEN CHEN et. al. | acl | 2025-07-21 |
| 586 | Optimizing Question Semantic Space for Dynamic Retrieval-Augmented Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose Optimizing Question Semantic Space for Dynamic Retrieval-Augmented Multi-hop Question Answering (Q-DREAM). |
LINHAO YE et. al. | acl | 2025-07-21 |
| 587 | EXPLAIN: Enhancing Retrieval-Augmented Generation with Entity Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce EXPLAIN (EXtracting, Pre-summarizing, Linking and enhAcINg RAG), a novel retrieval-augmented generation method that automatically extracts useful entities and generates summaries from documents. |
YAOZHEN LIANG et. al. | acl | 2025-07-21 |
| 588 | QAEncoder: Towards Aligned Representation Learning in Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the inherent gap between user queries and relevant documents hinders precise matching. We introduce QAEncoder, a training-free approach to bridge this gap. |
ZHENGREN WANG et. al. | acl | 2025-07-21 |
| 589 | YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce YESciEval, an open-source framework that combines fine-grained rubric-based assessment with reinforcement learning to mitigate optimism bias in LLM evaluators. |
Jennifer D’Souza; Hamed Babaei Giglou; Quentin Münch; | acl | 2025-07-21 |
| 590 | Grounded, or A Good Guesser? A Per-Question Balanced Dataset to Separate Blind from Grounded Models for Embodied Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, previous work has demonstrated that blind language models (which do not incorporate perception, but predict an answer based solely on the question text) are a strong baseline for existing benchmarks, even compared against state-of-the-art vision and language models. To determine whether a model is grounding its answers in its specific environment, rather than relying on a language model’s expectations about the world generally, we propose PQB-EQA, a *per-question balanced* EQA dataset. |
MILES SHELTON et. al. | acl | 2025-07-21 |
| 591 | NeuSym-RAG: Hybrid Neural Symbolic Retrieval with Multiview Structuring for PDF Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose NeuSym-RAG, a hybrid neural symbolic retrieval framework which combines both paradigms in an interactive process. |
RUISHENG CAO et. al. | acl | 2025-07-21 |
| 592 | MRAKL: Multilingual Retrieval-Augmented Knowledge Graph Construction for Low-Resourced Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we reformulate the mKGC task as a Question Answering (QA) task and introduce mRAKL: a Retrieval-Augmented Generation (RAG) based system to perform mKGC. |
Hellina Hailu Nigatu; Min Li; Maartje ter Hoeve; Saloni Potdar; Sarah Chasins; | arxiv-cs.CL | 2025-07-21 |
| 593 | CSTree-SRI: Introspection-Driven Cognitive Semantic Tree for Multi-Turn Question Answering Over Extra-Long Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the challenges, we propose the CSTree-SRI framework (Cognitive Semantic Tree through Summarization, Retrieval, and Introspection). |
ZHAOWEN WANG et. al. | acl | 2025-07-21 |
| 594 | BELLE: A Bi-Level Multi-Agent Reasoning Framework for Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Thus, we propose a Bi-levEL muLti-agEnt reasoning (BELLE) framework to address multi-hop QA by specifically focusing on the correspondence between question types and methods, where each type of method is regarded as an “operator” by prompting LLMs differently. |
Taolin Zhang; Dongyang Li; Qizhou Chen; Chengyu Wang; Xiaofeng He; | acl | 2025-07-21 |
| 595 | Exploiting The Shadows: Unveiling Privacy Leaks Through Lower-Ranked Tokens in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel attack method by exploiting the model’s lower-ranked output tokens to leak sensitive information. |
Yuan Zhou; Zhuo Zhang; Xiangyu Zhang; | acl | 2025-07-21 |
| 596 | NGQA: A Nutritional Graph Question Answering Benchmark for Personalized Health-aware Nutritional Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: On the other hand, while large language models (LLMs), a popular solution for this task, demonstrate strong reasoning abilities, they struggle with the domain-specific complexities of personalized healthy dietary reasoning, and existing benchmarks fail to capture these challenges. To address these gaps, we introduce the Nutritional Graph Question Answering (NGQA) benchmark, the first graph question answering dataset designed for personalized nutritional health reasoning. |
ZHEYUAN ZHANG et. al. | acl | 2025-07-21 |
| 597 | On Synthesizing Data for Context Attribution in Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Providing this information is the task of context attribution. In this paper, we systematically study LLM-based approaches for this task, namely we investigate (i) zero-shot inference, (ii) LLM ensembling, and (iii) fine-tuning of small LMs on synthetic data generated by larger LLMs. |
GORJAN RADEVSKI et. al. | acl | 2025-07-21 |
| 598 | Beyond Surface Simplicity: Revealing Hidden Reasoning Attributes for Precise Commonsense Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current benchmarks overlook these hidden reasoning attributes, making it difficult to assess a model’s specific levels of commonsense knowledge and reasoning ability. To address this issue, we introduce ReComSBench, a novel framework that reveals hidden reasoning attributes behind commonsense questions by leveraging the knowledge generated during the reasoning process. |
Huijun Lian; Zekai Sun; Keqi Chen; Yingming Gao; Ya Li; | acl | 2025-07-21 |
| 599 | Masking in Multi-hop QA: An Analysis of How Language Models Perform with Context Permutation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we explore how LMs respond to multi-hop questions by permuting search results (retrieved documents) under various configurations. |
Wenyu Huang; Pavlos Vougiouklis; Mirella Lapata; Jeff Z. Pan; | acl | 2025-07-21 |
| 600 | Ontology-Guided Reverse Thinking Makes Large Language Models Stronger on Knowledge Graph Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As a result, it is difficult to establish reasoning paths to the purpose, which leads to information loss and redundancy. To address this issue, inspired by human reverse thinking, we propose Ontology-Guided Reverse Thinking (ORT), a novel framework that constructs reasoning paths from purposes back to conditions. |
RUNXUAN LIU et. al. | acl | 2025-07-21 |
| 601 | CaLMQA: Exploring Culturally Specific Long-form Question Answering Across 23 Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We define culturally specific questions as those that refer to concepts unique to one or a few cultures, or have different answers depending on the cultural or regional context. We obtain these questions by crawling naturally-occurring questions from community web forums in high-resource languages, and by hiring native speakers to write questions in under-resourced, rarely-studied languages such as Fijian and Kirundi. |
SHANE ARORA et. al. | acl | 2025-07-21 |
| 602 | Reasoning Models Are Test Exploiters: Rethinking Multiple-Choice Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For each model–benchmark pair, we considered 5 ways of presenting the model with questions, including variations on whether multiple choices were offered to the model at all; whether none of the above sometimes replaced the right answer; and whether the model was permitted to perform chain-of-thought reasoning before and/or after the choices were presented. |
Narun Raman; Taylor Lundy; Kevin Leyton-Brown; | arxiv-cs.CL | 2025-07-21 |
| 603 | Beyond The Answer: Advancing Multi-Hop QA with Fine-Grained Graph Reasoning and Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We argue that the reasoning process should also be evaluated because wrong reasoning process can also lead to the correct final answers. Motivated by this, we propose a “Planner-Executor-Reasoner” (PER) architecture, which forms the core of the Plan-anchored Data Preprocessing (PER-DP) and the Plan-guided Multi-Hop QA (PER-QA). |
QICHUAN LIU et. al. | acl | 2025-07-21 |
| 604 | Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most existing methods and datasets remain focused on a narrow spectrum of tasks, such as forecasting or anomaly detection. To bridge this gap, we introduce Time Series Multi-Task Question Answering (Time-MQA), a unified framework that enables natural language queries across multiple time series tasks – numerical analytical tasks and open-ended question answering with reasoning. |
YAXUAN KONG et. al. | acl | 2025-07-21 |
| 605 | DTCRS: Dynamic Tree Construction for Recursive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce DTCRS, a method that dynamically generates summary trees based on document structure and query semantics. |
Guanran Luo; Zhongquan Jian; Wentao Qiu; Meihong Wang; Qingqiang Wu; | acl | 2025-07-21 |
| 606 | ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Dense retrievers typically outperform sparse methods like BM25 by leveraging semantic embeddings in many tasks; however, they require labeled query-document pairs for fine-tuning, which poses a significant challenge in MHQA due to the complexity of the reasoning steps. To overcome this limitation, we introduce Retriever Supervision with Consistency and Relevance (ReSCORE), a novel method for training dense retrievers for MHQA without the need for labeled documents. |
DOSUNG LEE et. al. | acl | 2025-07-21 |
| 607 | AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce AfriMed-QA, the first large-scale Pan-African English multi-specialty medical Question-Answering (QA) dataset, 15,000 questions (open and closed-ended) sourced from over 60 medical schools across 16 countries, covering 32 medical specialties. |
CHARLES NIMO et. al. | acl | 2025-07-21 |
| 608 | Learning Sparsity for Effective and Efficient Music Performance Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing Music AVQA methods often rely on dense and unoptimized representations, leading to inefficiencies in the isolation of key information, the reduction of redundancy, and the prioritization of critical samples. To address these challenges, we introduce Sparsify, a sparse learning framework specifically designed for Music AVQA. |
XINGJIAN DIAO et. al. | acl | 2025-07-21 |
| 609 | Can LLMs Evaluate Complex Attribution in QA? Automatic Benchmarking Using Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Attributed Question Answering (AQA) has attracted wide attention, but there are still several limitations in evaluating the attributions, including lacking fine-grained attribution categories, relying on manual annotations, and failing to compare attributions with only subtle differences. To bridge these gaps, we introduce Complex Attributed Question Answering (CAQA), a large-scale benchmark containing comprehensive attribution categories, automatically generated using Knowledge Graphs (KGs), and complex attribution scenarios. |
NAN HU et. al. | acl | 2025-07-21 |
| 610 | Micro-Act: Mitigate Knowledge Conflict in Question Answering Via Actionable Self-Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing approaches often attempt to mitigate conflicts by directly comparing two knowledge sources in a side-by-side manner, but this can overwhelm LLMs with extraneous or lengthy contexts, ultimately hindering their ability to identify and mitigate inconsistencies. To address this issue, we propose Micro-Act, a framework with a hierarchical action space that automatically perceives context complexity and adaptively decomposes each knowledge source into a sequence of fine-grained comparisons. |
NAN HUO et. al. | acl | 2025-07-21 |
| 611 | Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Besides, RAG is not always needed as may introduce irrelevant information. Recent adaptive retrieval methods integrate LLMs’ intrinsic knowledge with external information appealing to LLM self-knowledge, but they often neglect efficiency evaluations and comparisons with uncertainty estimation techniques. |
VIKTOR MOSKVORETSKII et. al. | acl | 2025-07-21 |
| 612 | Not All Terms Matter: Recall-Oriented Adaptive Learning for PLM-aided Query Expansion in Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In response, we propose a novel Recall-oriented Adaptive Learning (ReAL) method, which iteratively adjusts the importance weights of QE terms based on their relevance, thereby refining term distinction and enhancing the separation of relevant terms. |
Xinran Chen; Ben He; Xuanang Chen; Le Sun; | acl | 2025-07-21 |
| 613 | Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we identify a critical problem, “lost-in-retrieval”, in retrieval-augmented multi-hop question answering (QA): the key entities are missed in LLMs’ sub-question decomposition. |
Rongzhi Zhu; Xiangyu Liu; Zequn Sun; Yiwei Wang; Wei Hu; | acl | 2025-07-21 |
| 614 | QAEval: Mixture of Evaluators for Question-Answering Task Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: LLM-based evaluation methods offer greater flexibility but suffer from sensitivity to instructions, robustness issues, and high computational costs. To overcome these challenges, we introduce QAEval, a hybrid framework combining rule-based reliability with LLM-based adaptability. |
TAN YUE et. al. | acl | 2025-07-21 |
| 615 | Doc-React: Multi-page Heterogeneous Document Question-answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by iterative frameworks like ReAct, which refine retrieval through feedback, we propose Doc-React, an adaptive iterative framework that balances information gain and uncertainty reduction at each step. |
JUNDA WU et. al. | acl | 2025-07-21 |
| 616 | How to Compare Things Properly? A Study of Argument Relevance in Comparative Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: It poses unique challenges due to the inherently subjective nature of many questions and the need to integrate diverse perspectives. |
IRINA NIKISHINA et. al. | acl | 2025-07-21 |
| 617 | Can Large Language Models Accurately Generate Answer Keys for Health-related Questions? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we explore several approaches to nugget generation for medical question answering and evaluate their alignment with expert human nugget generation. |
Davis Bartels; Deepak Gupta; Dina Demner-Fushman; | acl | 2025-07-21 |
| 618 | A Multi-Agent Framework for Mitigating Dialect Biases in Privacy Policy Question-Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel multi-agent framework inspired by human-centered design principles to mitigate dialectal biases. |
ĐORĐE KLISURA et. al. | acl | 2025-07-21 |
| 619 | InterAct-Video: Reasoning-Rich Video QA for Urban Traffic Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing VideoQA models struggle with the complexity of real-world traffic scenes, where multiple concurrent events unfold across spatiotemporal dimensions. To address these challenges, this paper introduces InterAct VideoQA, a curated dataset designed to benchmark and enhance VideoQA models for traffic monitoring tasks. |
JOSEPH RAJ VISHAL et. al. | arxiv-cs.CV | 2025-07-19 |
| 620 | LeAdQA: LLM-Driven Context-Aware Temporal Grounding for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While recent advances in multimodal learning have improved alignment and fusion, current approaches remain limited by two prevalent but fundamentally flawed strategies: (1) task-agnostic sampling indiscriminately processes all frames, overwhelming key events with irrelevant content; and (2) heuristic retrieval captures superficial patterns but misses causal-temporal structures needed for complex reasoning. To address these challenges, we introduce LeAdQA, an innovative approach that bridges these gaps through synergizing causal-aware query refinement with fine-grained visual grounding. |
XINXIN DONG et. al. | arxiv-cs.CV | 2025-07-19 |
| 621 | Team of One: Cracking Complex Video QA with Model Synergy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel framework for open-ended video question answering that enhances reasoning depth and robustness in complex real-world scenarios, as benchmarked on the CVRR-ES dataset. |
JUN XIE et. al. | arxiv-cs.CV | 2025-07-18 |
| 622 | SPARQL Query Generation with LLMs: Measuring The Impact of Training Data Memorization and Knowledge Injection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a novel method that evaluates the quality of LLMs by generating a SPARQL query from a natural-language question under various conditions: (1) zero-shot SPARQL generation, (2) with knowledge injection, and (3) with anonymized knowledge injection. |
Aleksandr Gashkov; Aleksandr Perevalov; Maria Eltsova; Andreas Both; | arxiv-cs.IR | 2025-07-18 |
| 623 | Teaching Vision-Language Models to Ask: Resolving Ambiguity in Visual Questions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To overcome these challenges, we introduce ClearVQA benchmark, which targets three common categories of ambiguity in VQA context, and encompasses various VQA scenarios. |
Pu Jian; Donglei Yu; Wen Yang; Shuo Ren; Jiajun Zhang; | arxiv-cs.CV | 2025-07-18 |
| 624 | COREVQA: A Crowd Observation and Reasoning Entailment Visual Question Answering Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this, we propose COREVQA (Crowd Observations and Reasoning Entailment), a benchmark of 5608 image and synthetically generated true/false statement pairs, with images derived from the CrowdHuman dataset, to provoke visual entailment reasoning on challenging crowded images. |
ISHANT CHINTAPATLA et. al. | arxiv-cs.CV | 2025-07-17 |
| 625 | FIQ: Fundamental Question Generation with The Integration of Question Embeddings for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a fundamental question generation with the integration of question embeddings for video question answering (FIQ), a novel approach designed to strengthen the reasoning ability of the model by enhancing the fundamental understanding of videos. |
Ju-Young Oh; Ho-Joong Kim; Seong-Whan Lee; | arxiv-cs.CV | 2025-07-17 |
| 626 | POLYCHARTQA: Benchmarking Large Vision-Language Models with Multilingual Chart Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present PolyChartQA, the first large-scale multilingual chart question answering benchmark covering 22,606 charts and 26,151 question-answering pairs across 10 diverse languages. |
Yichen Xu; Liangyu Chen; Liang Zhang; Wenxuan Wang; Qin Jin; | arxiv-cs.CL | 2025-07-16 |
| 627 | Describe Anything Model for Visual Question Answering on Text-rich Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In such settings, the fine-grained extraction of textual information is crucial to producing correct answers. Motivated by this, we introduce DAM-QA, a framework with a tailored evaluation protocol, developed to investigate and harness the region-aware capabilities from DAM for the text-rich VQA problem that requires reasoning over text-based information within images. |
YEN-LINH VU et. al. | arxiv-cs.CV | 2025-07-16 |
| 628 | The Benefits of Query-based KGQA Systems for Complex and Temporal Questions in LLM Era Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We explore multi-stage query-based framework for WikiData QA, proposing multi-stage approach that enhances performance on challenging multi-hop and temporal benchmarks. |
ARTEM ALEKSEEV et. al. | arxiv-cs.CL | 2025-07-16 |
| 629 | 3D-MoRe: Unified Modal-Contextual Reasoning for Embodied Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With the growing need for diverse and scalable data in indoor scene tasks, such as question answering and dense captioning, we propose 3D-MoRe, a novel paradigm designed to generate large-scale 3D-language datasets by leveraging the strengths of foundational models. |
RONGTAO XU et. al. | arxiv-cs.CV | 2025-07-16 |
| 630 | MapIQ: Evaluating Multimodal Large Language Models for Map Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We evaluate multiple MLLMs using six visual analytical tasks, comparing their performance against one another and a human baseline. |
Varun Srivastava; Fan Lei; Srija Mukhopadhyay; Vivek Gupta; Ross Maciejewski; | arxiv-cs.CL | 2025-07-15 |
| 631 | EsBBQ and CaBBQ: The Spanish and Catalan Bias Benchmarks for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Given the notable lack of resources for social bias evaluation in languages other than English, and for social contexts outside of the United States, this paper introduces the Spanish and the Catalan Bias Benchmarks for Question Answering (EsBBQ and CaBBQ). |
VALLE RUIZ-FERNÁNDEZ et. al. | arxiv-cs.CL | 2025-07-15 |
| 632 | ExpliCIT-QA: Explainable Code-Based Image Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present ExpliCIT-QA, a system that extends our previous MRT approach for tabular question answering into a multimodal pipeline capable of handling complex table images and providing explainable answers. |
Maximiliano Hormazábal Lagos; Álvaro Bueno Sáez; Pedro Alonso Doval; Jorge Alcalde Vesteiro; Héctor Cerezo-Costas; | arxiv-cs.CL | 2025-07-15 |
| 633 | Warehouse Spatial Question Answering with LLM Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a data-efficient approach. |
HSIANG-WEI HUANG et. al. | arxiv-cs.CV | 2025-07-14 |
| 634 | CG-RAG: Research Question Answering By Citation Graph Retrieval-Augmented LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Contextualized Graph Retrieval-Augmented Generation (CG-RAG), a novel framework that integrates sparse and dense retrieval signals within graph structures to enhance retrieval efficiency and subsequently improve generation quality for research question answering. |
YUNTONG HU et. al. | sigir | 2025-07-13 |
| 635 | Understanding Large Language Model Performance in Software Engineering: A Large-scale Question Answering Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce CodeRepoQA, a large-scale benchmark specifically designed for evaluating repository-level question-answering capabilities in the field of software engineering. |
RUIDA HU et. al. | sigir | 2025-07-13 |
| 636 | Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for Deep Research Highlight: We present Researchy Questions, the world’s first, only and largest public dataset of “Deep Research” questions filtered from real search engine logs to be non-factoid, “decompositional” and multi-perspective. |
CORBIN ROSSET et. al. | sigir | 2025-07-13 |
| 637 | WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval Highlight: We present WebFAQ, a large-scale collection of open-domain question answering datasets derived from FAQ-style schema.org annotations. |
Michael Dinzinger; Laura Caspari; Kanishka Ghosh Dastidar; Jelena Mitrović; Michael Granitzer; | sigir | 2025-07-13 |
| 638 | Wrong Answers Can Also Be Useful: PlausibleQA – A Large-Scale QA Dataset with Answer Plausibility Scores Highlight: Existing QA datasets primarily focus on correct answers without explicit consideration of the plausibility of other candidate answers, limiting opportunity for more nuanced evaluations of models. To address this gap, we introduce PlausibleQA, a large-scale dataset comprising 10,000 questions and 100,000 candidate answers, each annotated with plausibility scores and justifications for their selection. |
Jamshid Mozafari; Abdelrahman Abdallah; Bhawna Piryani; Adam Jatowt; | sigir | 2025-07-13 |
| 639 | ClusterChat: Multi-Feature Search for Corpus Exploration Highlight: We present ClusterChat, an open-source system for corpus exploration that integrates cluster-based organization of documents using textual embeddings with lexical and semantic search, timeline-driven exploration, and corpus and document-level question answering (QA) as multi-feature search capabilities. The demo video and source code are available at: https://github.com/achouhan93/ClusterChat. |
Ashish Chouhan; Saifeldin Mandour; Michael Gertz; | sigir | 2025-07-13 |
| 640 | Dynamic-KGQA: A Scalable Framework for Generating Adaptive Question Answering Datasets Highlight: In this work, we introduce Dynamic-KGQA, a scalable framework for generating adaptive QA datasets from knowledge graphs (KGs), designed to mitigate memorization risks while maintaining statistical consistency across iterations. |
Preetam Prabhu Srikar Dammu; Himanshu Naidu; Chirag Shah; | sigir | 2025-07-13 |
| 641 | PILs of Knowledge: A Synthetic Benchmark for Evaluating Question Answering Systems in Healthcare Highlight: However, no dedicated benchmark currently exists to evaluate QA systems specifically on PILs, limiting progress in this domain. To address this gap, we introduce a fact-supported synthetic benchmark composed of multiple-choice questions and answers generated from real PILs. |
RICCARDO LUNARDI et. al. | sigir | 2025-07-13 |
| 642 | Graph-Based Multimodal Contrastive Learning for Chart Question Answering Highlight: This work introduces a novel joint multimodal scene graph framework that explicitly models the relationships among chart components and their underlying structures. |
Yue Dai; Soyeon Caren Han; Wei Liu; | sigir | 2025-07-13 |
| 643 | Question-Answering Dense Video Events Highlight: For improvement, we propose DeVi, a novel training-free MLLM approach that highlights a hierarchical captioning module, a temporal event memory module, and a self-consistency checking module to respectively detect, contextualize and memorize, and ground dense-events in long videos for question answering. |
Hangyu Qin; Junbin Xiao; Angela Yao; | sigir | 2025-07-13 |
| 644 | NLQxform-UI: An Interactive and Intuitive Scholarly Question Answering System Highlight: In this work, we develop an interactive and intuitive scholarly question answering system called NLQxform-UI, which allows users to pose complex queries in the form of natural language questions. |
Ruijie Wang; Zhiruo Zhang; Luca Rossetto; Florian Ruosch; Abraham Bernstein; | sigir | 2025-07-13 |
| 645 | An Empirical Study of Evaluating Long-form Question Answering Highlight: We collect 5,236 factoid and non-factoid long-form answers generated by different large language models and conduct a human evaluation on 2,079 of them, focusing on correctness and informativeness. |
Ning Xian; Yixing Fan; Ruqing Zhang; Maarten de Rijke; Jiafeng Guo; | sigir | 2025-07-13 |
| 646 | Evaluating LLMs’ (In)ability to Follow Prompts in QA Tasks Highlight: To address our research question, we propose Oedipus, an evaluation framework to evaluate LLMs’ ability to follow prompts. |
Aparup Khatua; Tobias Kalmbach; Prasenjit Mitra; Sandipan Sikdar; | sigir | 2025-07-13 |
| 647 | Towards Spatial Audio Understanding Via Question Answering Highlight: In this paper, we introduce a novel framework for spatial audio understanding of first-order ambisonic (FOA) signals through a question answering (QA) paradigm, aiming to extend the scope of sound event localization and detection (SELD) towards spatial scene understanding and reasoning. |
Parthasaarathy Sudarsanam; Archontis Politis; | arxiv-cs.SD | 2025-07-12 |
| 648 | What Factors Affect LLMs and RLLMs in Financial Question Answering? Highlight: To investigate the impact of various methods on LLMs and RLLMs, we utilize five LLMs and three RLLMs to assess the effects of prompting methods, agentic frameworks, and multilingual alignment methods on financial question-answering tasks. |
PENG WANG et. al. | arxiv-cs.CL | 2025-07-11 |
| 649 | Exploring The Limits of Model Compression in LLMs: A Knowledge Distillation Study on QA Tasks Highlight: We evaluate student models distilled from the Pythia and Qwen2.5 families on two QA benchmarks, SQuAD and MLQA, under zero-shot and one-shot prompting conditions. |
Joyeeta Datta; Niclas Doll; Qusai Ramadan; Zeyd Boukhers; | arxiv-cs.CL | 2025-07-10 |
| 650 | FrugalRAG: Learning to Retrieve and Reason for Multi-hop QA Highlight: In this work, we show that: (1) Large-scale fine-tuning is not needed to improve RAG metrics, contrary to popular claims in recent literature. |
Abhinav Java; Srivathsan Koundinyan; Nagarajan Natarajan; Amit Sharma; | arxiv-cs.CL | 2025-07-10 |
| 651 | Data-Balanced Curriculum Learning for Audio Question Answering Highlight: Current models struggle with dataset imbalances and unstable training dynamics. This work combines curriculum learning with statistical data balancing to address these challenges. |
Gijs Wijngaard; Elia Formisano; Michele Esposito; Michel Dumontier; | arxiv-cs.SD | 2025-07-09 |
| 652 | Enhancing Food-Domain Question Answering with A Multimodal Knowledge Graph: Hybrid QA Generation and Diversity Analysis Highlight: We propose a unified food-domain QA framework that combines a large-scale multimodal knowledge graph (MMKG) with generative AI. |
Srihari K B; Pushpak Bhattacharyya; | arxiv-cs.CL | 2025-07-09 |
| 653 | Barriers in Integrating Medical Visual Question Answering Into Radiology Workflows: A Scoping Review and Clinicians’ Insights Highlight: This study systematically reviews 68 publications (2018-2024) and surveys 50 clinicians from India and Thailand to examine MedVQA’s practical utility, challenges, and gaps. |
Deepali Mishra; Chaklam Silpasuwanchai; Ashutosh Modi; Madhumita Sushil; Sorayouth Chumnanvej; | arxiv-cs.CL | 2025-07-09 |
| 654 | Enhancing Scientific Visual Question Answering Through Multimodal Reasoning and Ensemble Modeling Highlight: We conducted a series of experiments using models with 5B to 8B parameters. Our strongest individual model, InternVL3, achieved ROUGE-1 and ROUGE-L F1 scores of 0.740 and a BERTScore of 0.983 on the SciVQA test split. |
Prahitha Movva; Naga Harshita Marupaka; | arxiv-cs.CV | 2025-07-08 |
| 655 | LLM-based Question-Answer Framework for Sensor-driven HVAC System Interaction Highlight: In this paper, we present JARVIS, a two-stage LLM-based QA framework tailored for sensor data-driven HVAC system interaction. |
SUNGMIN LEE et. al. | arxiv-cs.AI | 2025-07-07 |
| 656 | Building Open-Retrieval Conversational Question Answering Systems By Generating Synthetic Data and Decontextualizing User Questions Highlight: We propose a pipeline that capitalizes on the abundance of plain text documents in organizations (e.g., product documentation) to automatically produce realistic OR-CONVQA dialogs with annotations. |
CHRISTOS VLACHOS et. al. | arxiv-cs.CL | 2025-07-07 |
| 657 | Assessing The Capabilities and Limitations of FinGPT Model in Financial NLP Applications Highlight: This work evaluates FinGPT, a financial domain-specific language model, across six key natural language processing (NLP) tasks: Sentiment Analysis, Text Classification, Named Entity Recognition, Financial Question Answering, Text Summarization, and Stock Movement Prediction. |
Prudence Djagba; Chimezie A. Odinakachukwu; | arxiv-cs.CL | 2025-07-06 |
| 658 | Beyond Independent Passages: Adaptive Passage Combination Retrieval for Retrieval Augmented Open-Domain Question Answering Highlight: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating external documents at inference time, enabling up-to-date knowledge access without costly retraining. |
Ting-Wen Ko; Jyun-Yu Jiang; Pu-Jen Cheng; | arxiv-cs.CL | 2025-07-05 |
| 659 | Guiding Audio-Visual Question Answering with Collective Question Reasoning |
BAOQI PEI et. al. | International Journal of Computer Vision | 2025-07-03 |
| 660 | Coling-UniA at SciVQA 2025: Few-Shot Example Retrieval and Confidence-Informed Ensembling for Multimodal Large Language Models Highlight: This paper describes our system for the SciVQA 2025 Shared Task on Scientific Visual Question Answering. |
Christian Jaumann; Annemarie Friedrich; Rainer Lienhart; | arxiv-cs.CL | 2025-07-03 |
| 661 | Chart Question Answering from Real-World Analytical Narratives Highlight: We present a new dataset for chart question answering (CQA) constructed from visualization notebooks. |
Maeve Hutchinson; Radu Jianu; Aidan Slingsby; Jo Wood; Pranava Madhyastha; | arxiv-cs.CL | 2025-07-02 |
| 662 | OpenTable-R1: A Reinforcement Learning Augmented Tool Agent for Open-Domain Table Question Answering Highlight: In contrast, we propose an end-to-end agentic framework that embeds multi-turn tool calls (using a BM25+-based search API and a SQLite SQL executor) directly into a large language model. |
Zipeng Qiu; | arxiv-cs.CL | 2025-07-02 |
| 663 | Retriever-generator-verification: A Novel Approach to Enhancing Factual Coherence in Open-domain Question Answering |
SHIQI SUN et. al. | Inf. Process. Manag. | 2025-07-01 |
| 664 | Enhancing SPARQL Query Generation for Question Answering with A Hybrid Encoder-decoder and Cross-attention Model |
Yi-Hui Chen; Eric Jui-Lin Lu; Kwan-Ho Cheng; | J. Web Semant. | 2025-07-01 |
| 665 | Read The Docs Before Rewriting: Equip Rewriter with Domain Knowledge Via Continual Pre-training Highlight: However, in specialized domains, the rewriter model may struggle due to limited domain-specific knowledge. To resolve this, we propose the R&R (Read the doc before Rewriting) rewriter, which involves continual pre-training on professional documents, akin to how students prepare for open-book exams by reviewing textbooks. |
Qi Wang; Yixuan Cao; Yifan Liu; Jiangtao Zhao; Ping Luo; | arxiv-cs.IR | 2025-07-01 |
| 666 | A Knowledge Graph-enhanced Large Language Model for Question Answering of Hydraulic Structure Safety Management |
DONGLIANG ZHANG et. al. | Adv. Eng. Informatics | 2025-07-01 |
| 667 | A RAG Approach for Multi-Modal Open-ended Lifelog Question-Answering Abstract: Lifelogging is the passive collection, storage and analysis of daily data through wearable sensors. Question Answering (QA) for lifelog data enables natural language interactions … |
QUANG-LINH TRAN et. al. | Proceedings of the 2025 International Conference on … | 2025-06-30 |
| 668 | MedEthicsQA: A Comprehensive Question Answering Benchmark for Medical Ethics Evaluation of LLMs Highlight: This paper introduces MedEthicsQA, a comprehensive benchmark comprising 5,623 multiple-choice questions and 5,351 open-ended questions for evaluation of medical ethics in LLMs. We systematically establish a hierarchical taxonomy integrating global medical ethical standards. |
JIANHUI WEI et. al. | arxiv-cs.CL | 2025-06-28 |
| 669 | DocVXQA: Context-Aware Visual Explanations for Document Question Answering Highlight: We propose DocVXQA, a novel framework for visually self-explainable document question answering, where the goal is not only to produce accurate answers to questions but also to learn visual heatmaps that highlight critical regions, offering interpretable justifications for the model decision. |
MOHAMED ALI SOUIBGUI et. al. | icml | 2025-06-25 |
| 670 | Knowledge-Aware Diverse Reranking for Cross-Source Question Answering Highlight: This paper presents Team Marikarp’s solution for the SIGIR 2025 LiveRAG competition. |
Tong Zhou; | arxiv-cs.CL | 2025-06-25 |
| 671 | Towards Probabilistic Question Answering Over Tabular Data Highlight: In this paper, we introduce a new benchmark LUCARIO and a framework for probabilistic QA over large tabular data. Our method induces Bayesian Networks from tables, translates natural language queries into probabilistic queries, and uses large language models (LLMs) to generate final answers. |
Chen Shen; Sajjadur Rahman; Estevam Hruschka; | arxiv-cs.CL | 2025-06-25 |
| 672 | 3D Question Answering Via Only 2D Vision-Language Models Highlight: In this work, we explore how to harness their potential to address 3D scene understanding tasks, using 3D question answering (3D-QA) as a representative example. |
FENGYUN WANG et. al. | icml | 2025-06-25 |
| 673 | FactTest: Factuality Testing in Large Language Models with Finite-Sample and Distribution-Free Guarantees Highlight: In this paper, we introduce FactTest, a novel framework that statistically assesses whether an LLM can provide correct answers to given questions with high-probability correctness guarantees. |
FAN NIE et. al. | icml | 2025-06-25 |
| 674 | Understanding Complexity in VideoQA Via Visual Program Generation Highlight: We propose a data-driven approach to analyzing query complexity in Video Question Answering (VideoQA). |
CRISTOBAL EYZAGUIRRE et. al. | icml | 2025-06-25 |
| 675 | ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset IF:3 Highlight: To address this, we introduce the Time-Series Question Answering (Time-Series QA) task and release EngineMT-QA, the first large-scale, multi-task, temporal-textual QA dataset designed to capture complex interactions between time-series signals and natural language. Building on this resource, we propose the Instruct Time Transformer (ITFormer), a novel framework that bridges time-series encoders with frozen large language models (LLMs). |
YILIN WANG et. al. | icml | 2025-06-25 |
| 676 | MultiFinRAG: An Optimized Multimodal Retrieval-Augmented Generation (RAG) Framework for Financial Question Answering Highlight: We introduce MultiFinRAG, a retrieval-augmented generation framework purpose-built for financial QA. |
Chinmay Gondhalekar; Urjitkumar Patel; Fang-Chun Yeh; | arxiv-cs.CL | 2025-06-25 |
| 677 | Divide and Conquer: Exploring Language-centric Tree Reasoning for Video Question-Answering Highlight: Video Question-Answering (VideoQA) remains challenging in achieving advanced cognitive reasoning due to the uncontrollable and opaque reasoning processes in existing Multimodal Large Language Models (MLLMs). To address this issue, we propose a novel Language-centric Tree Reasoning (LTR) framework that targets on enhancing the reasoning ability of models. |
ZHAOHE LIAO et. al. | icml | 2025-06-25 |
| 678 | TUMTraf VideoQA: Dataset and Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes Highlight: We present TUMTraf VideoQA, a novel dataset and benchmark designed for spatio-temporal video understanding in complex roadside traffic scenarios. |
XINGCHENG ZHOU et. al. | icml | 2025-06-25 |
| 679 | Inference Scaled GraphRAG: Improving Multi Hop Question Answering on Knowledge Graphs Highlight: We introduce Inference-Scaled GraphRAG, a novel framework that enhances LLM-based graph reasoning by applying inference-time compute scaling. |
Travis Thompson; Seung-Hwan Lim; Paul Liu; Ruoying He; Dongkuan Xu; | arxiv-cs.CL | 2025-06-24 |
| 680 | Evaluating Large Language Models for Requirements Question Answering in Industrial Aerospace Software Abstract: Aerospace software presents significant challenges to requirements engineering due to its design complexity and stringent safety standards. When manually drafting requirement … |
LONGXING YANG et. al. | Proceedings of the 33rd ACM International Conference on the … | 2025-06-23 |
| 681 | PDF Retrieval Augmented Question Answering Highlight: This paper presents an advancement in Question-Answering (QA) systems using a Retrieval Augmented Generation (RAG) framework to enhance information extraction from PDF files. |
Thi Thu Uyen Hoang; Viet Anh Nguyen; | arxiv-cs.CL | 2025-06-22 |
| 682 | A Comprehensive Graph Framework for Question Answering with Mode-Seeking Preference Alignment Highlight: Recent advancements in retrieval-augmented generation (RAG) have enhanced large language models in question answering by integrating external knowledge. However, challenges persist in achieving global understanding and aligning responses with human ethical and quality preferences. To address these issues, we propose GraphMPA, a comprehensive graph-based framework with mode-seeking preference alignment. |
QUANWEI TANG et. al. | arxiv-cs.CL | 2025-06-22 |
| 683 | UNITQA: A Unified Automated Tabular Question Answering System with Multi-Agent Large Language Models Abstract: Automated tabular question answering (TQA) has attracted significant attention in data analysis and natural language processing communities due to its powerful capabilities. The … |
JUN-PENG ZHU et. al. | Companion of the 2025 International Conference on … | 2025-06-22 |
| 684 | MUPA: Towards Multi-Path Agentic Reasoning for Grounded Video Question Answering Highlight: In this work, we propose MUPA, a cooperative MUlti-Path Agentic approach that unifies video grounding, question answering, answer reflection and aggregation to tackle Grounded VideoQA. |
JISHENG DANG et. al. | arxiv-cs.CV | 2025-06-22 |
| 685 | LastingBench: Defend Benchmarks Against Knowledge Leakage Highlight: In this paper, we introduce LastingBench, a novel framework designed to continuously reinforce and safeguard existing benchmarks against knowledge leakage. |
Yixiong Fang; Tianran Sun; Yuling Shi; Min Wang; Xiaodong Gu; | arxiv-cs.CL | 2025-06-21 |
| 686 | Resource-Friendly Dynamic Enhancement Chain for Multi-Hop Question Answering Highlight: This work proposes a novel framework called DEC (Dynamic Enhancement Chain). |
BINQUAN JI et. al. | arxiv-cs.CL | 2025-06-21 |
| 687 | ESapiens: A Real-World NLP Framework for Multimodal Document Understanding and Enterprise Knowledge Processing Highlight: We introduce eSapiens, a unified question-answering system designed for enterprise settings, which bridges structured databases and unstructured textual corpora via a dual-module architecture. |
ISAAC SHI et. al. | arxiv-cs.IR | 2025-06-20 |
| 688 | RAGentA: Multi-Agent Retrieval-Augmented Generation for Attributed Question Answering Highlight: We present RAGentA, a multi-agent retrieval-augmented generation (RAG) framework for attributed question answering (QA) with large language models (LLMs). |
Ines Besrour; Jingbo He; Tobias Schreieder; Michael Färber; | arxiv-cs.IR | 2025-06-20 |
| 689 | How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering? Highlight: We investigate whether off-the-shelf Multimodal Large Language Models (MLLMs) can tackle Online Episodic-Memory Video Question Answering (OEM-VQA) without additional training. |
Giuseppe Lando; Rosario Forte; Giovanni Maria Farinella; Antonino Furnari; | arxiv-cs.CV | 2025-06-19 |
| 690 | Enhancing Document-Level Question Answering Via Multi-Hop Retrieval-Augmented Generation with LLaMA 3 Highlight: This paper presents a novel Retrieval-Augmented Generation (RAG) framework tailored for complex question answering tasks, addressing challenges in multi-hop reasoning and contextual understanding across lengthy documents. Built upon LLaMA 3, the framework integrates a dense retrieval module with advanced context fusion and multi-hop reasoning mechanisms, enabling more accurate and coherent response generation. |
XINYUE HUANG et. al. | arxiv-cs.CL | 2025-06-19 |
| 691 | Evaluating Multimodal Large Language Models on Educational Textbook Question Answering Highlight: This work presents the first evaluation of state-of-the-art MLLMs, including LLaVA-1.5 and LLaMA 3.2-Vision, on the textbook question answering (TQA) task using the CK12-QA dataset. |
HESSA A. ALAWWAD et. al. | arxiv-cs.CL | 2025-06-18 |
| 692 | MEGC2025: Micro-Expression Grand Challenge on Spot Then Recognize and Visual Question Answering Abstract: Facial micro-expressions (MEs) are involuntary movements of the face that occur spontaneously when a person experiences an emotion but attempts to suppress or repress the facial … |
XINQI FAN et. al. | arxiv-cs.CV | 2025-06-18 |
| 693 | MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs Highlight: Most notably, existing methods overlook the distinction between factoid and non-factoid questions. To address these challenges, we propose MinosEval, a novel evaluation method that first distinguishes open-ended questions and then ranks candidate answers using different evaluation strategies. |
YONGQI FAN et. al. | arxiv-cs.CL | 2025-06-18 |
| 694 | A Multilingual Multimodal Medical Examination Dataset for Visual Question Answering in Healthcare Abstract: Vision-Language Models (VLMs) excel in multimodal tasks, yet their effectiveness in specialized medical applications remains underexplored. Accurate interpretation of medical … |
GIUSEPPE RICCIO et. al. | 2025 IEEE 38th International Symposium on Computer-Based … | 2025-06-18 |
| 695 | Moment Sampling in Video LLMs for Long-Form Video QA Highlight: In this paper, we investigate the use of a general-purpose text-to-video moment retrieval model to guide the frame sampling process. |
MUSTAFA CHASMAI et. al. | arxiv-cs.CV | 2025-06-17 |
| 696 | Improving Multi-hop Question Answering with Prompting Explicit and Implicit Knowledge Aligned Human Reading Comprehension |
Guangming Huang; Yunfei Long; Cunjin Luo; | International Journal of Machine Learning and Cybernetics | 2025-06-16 |
| 697 | ArgHiTZ at ArchEHR-QA 2025: A Two-Step Divide and Conquer Approach to Patient Question Answering for Top Factuality Highlight: We introduce an end-to-end prompt-based baseline and two two-step methods to divide the task, without utilizing any external knowledge. |
ADRIÁN CUADRÓN et. al. | arxiv-cs.CL | 2025-06-15 |
| 698 | Med-U1: Incentivizing Unified Medical Reasoning in LLMs Via Large-scale Reinforcement Learning Highlight: In this paper, we present Med-U1, a unified framework for robust reasoning across medical QA tasks with diverse output formats, ranging from MCQs to complex generation and computation tasks. |
XIAOTIAN ZHANG et. al. | arxiv-cs.CL | 2025-06-13 |
| 699 | Neural at ArchEHR-QA 2025: Agentic Prompt Optimization for Evidence-Grounded Clinical Question Answering Highlight: In this work, we present Neural, the runner-up in the BioNLP 2025 ArchEHR-QA shared task on evidence-grounded clinical QA. |
SAI PRASANNA TEJA REDDY BOGIREDDY et. al. | arxiv-cs.LG | 2025-06-12 |
| 700 | Exploring The Potential of Multimodal Large Language Models for Question Answering on Artworks Abstract: This paper investigates the application of a Multimodal Large Language Model to enhance visitor experiences in cultural heritage settings through Visual Question Answering (VQA) … |
Alessio Ferrato; Carla Limongelli; Fabio Gasparetti; Giuseppe Sansonetti; A. Micarelli; | Adjunct Proceedings of the 33rd ACM Conference on User … | 2025-06-12 |
| 701 | Mind The Gap: Benchmarking LLM Uncertainty, Discrimination, and Calibration in Specialty-Aware Clinical QA Highlight: In this work, we evaluate uncertainty estimation methods for clinical QA focusing, for the first time, on eleven clinical specialties and six question types, and across ten open-source LLMs (general-purpose, biomedical, and reasoning models). |
Alberto Testoni; Iacer Calixto; | arxiv-cs.CL | 2025-06-12 |
| 702 | Dynamic Double Space Tower Highlight: However, existing methods often have difficulty handling complex reasoning scenarios due to insufficient cross-modal interaction and capturing the entity spatial relationships in the image. We studied a brand-new approach to replace the attention mechanism in order to enhance the reasoning ability of the model and its understanding of spatial relationships. Specifically, we propose a dynamic bidirectional spatial tower, which is divided into four layers to observe the image according to the principle of human gestalt vision. |
Weikai Sun; Shijie Song; Han Wang; | arxiv-cs.CV | 2025-06-12 |
| 703 | Team Anotheroption at SemEval-2025 Task 8: Bridging The Gap Between Open-Source and Proprietary LLMs in Table QA Highlight: This paper presents a system developed for SemEval 2025 Task 8: Question Answering (QA) over tabular data. |
Nikolas Evkarpidi; Elena Tutubalina; | arxiv-cs.CL | 2025-06-11 |
| 704 | Token Constraint Decoding Improves Robustness on Question Answering for Large Language Models Highlight: In this paper, we introduce and evaluate Token Constraint Decoding (TCD). |
JUI-MING YAO et. al. | arxiv-cs.CL | 2025-06-11 |
| 705 | ICT-QA: Question Answering Over Multi-Modal Contexts Including Image, Chart, and Text Modalities Abstract: For question answering in multi-modal contexts that include image, chart, and text modalities, a model must be proficient in understanding each individual modality. Furthermore, … |
YOUNGROK JANG et. al. | 2025 IEEE/CVF Conference on Computer Vision and Pattern … | 2025-06-11 |
| 706 | VRAG: Retrieval-Augmented Video Question Answering for Long-Form Videos Abstract: The rapid expansion of video data across various domains has heightened the demand for efficient retrieval and question-answering systems, particularly for long-form videos. … |
BAO TRAN GIA et al. | 2025 IEEE/CVF Conference on Computer Vision and Pattern … | 2025-06-11 |
| 707 | CadenceRAG: Context-Aware and Dependency-Enhanced Retrieval Augmented Generation for Holistic Video Understanding Abstract: This paper addresses the challenging problem of holistic video understanding, focusing on rich-text-based video retrieval and question answering. Compared to simple video … |
HENG LIU et al. | 2025 IEEE/CVF Conference on Computer Vision and Pattern … | 2025-06-11 |
| 708 | An LLM Framework for Long-Form Video Retrieval and Audio-Visual Question Answering Using Qwen2/2.5 Abstract: This paper presents our approach to tackle the tasks of Known Item Search (KIS) and Video Question Answering (Video QA) by combining state-of-the-art LLMs and cross-modal video … |
Damianos Galanopoulos; Andreas Goulas; Antonios Leventakis; Ioannis Patras; Vasileios Mezaris; | 2025 IEEE/CVF Conference on Computer Vision and Pattern … | 2025-06-11 |
| 709 | Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-k Abstract: Retrieval-augmented generation (RAG) and long-context language models (LCLMs) both address context limitations of LLMs in open-domain question answering (QA). However, optimal … |
Chihiro Taguchi; Seiji Maekawa; Nikita Bhutani; | ArXiv | 2025-06-10 |
| 710 | Improved LLM Agents for Financial Document Question Answering Highlight: Building upon this framework, this paper examines the effectiveness of the traditional critic agent when oracle labels are not available, and shows, through experiments, that this critic agent’s performance deteriorates in this scenario. With this in mind, we present an improved critic agent, along with the calculator agent which outperforms the previous state-of-the-art approach (program-of-thought) and is safer. |
NELVIN TAN et al. | arxiv-cs.CL | 2025-06-10 |
| 711 | Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-$k$ Abstract: Retrieval-augmented generation (RAG) and long-context language models (LCLMs) both address context limitations of LLMs in open-domain question answering (QA). However, optimal … |
Chihiro Taguchi; Seiji Maekawa; Nikita Bhutani; | arxiv-cs.CL | 2025-06-10 |
| 712 | CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmarking of Large Language Models in Mental Health Question Answering Abstract: Medical question answering (QA) benchmarks often focus on multiple-choice or fact-based tasks, leaving open-ended answers to real patient questions underexplored. This gap is … |
YAHAN LI et al. | arxiv-cs.CL | 2025-06-10 |
| 713 | Looking Beyond Visible Cues: Implicit Video Question Answering Via Dual-Clue Reasoning Highlight: To tackle I-VQA, we propose a novel reasoning framework, IRM (Implicit Reasoning Model), incorporating dual-stream modeling of contextual actions and intent clues as implicit reasoning chains. |
TIEYUAN CHEN et al. | arxiv-cs.CV | 2025-06-09 |
| 714 | VidBridge-R1: Bridging QA and Captioning for RL-based Video Understanding Models with Intermediate Proxy Tasks Highlight: Naively combining reward signals from these tasks results in mutual performance degradation, which we attribute to a conflict between their opposing task natures. To address this challenge, we propose a novel training framework built upon two intermediate proxy tasks: DarkEventInfer, which presents videos with masked event segments, requiring models to infer the obscured content based on contextual video cues; and MixVidQA, which presents interleaved video sequences composed of two distinct clips, challenging models to isolate and reason about one while disregarding the other. |
XINLONG CHEN et al. | arxiv-cs.CV | 2025-06-09 |
| 715 | HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains Highlight: This paper proposes a novel image-text reasoning supervised fine-tuning data generation pipeline to create specific domain quadruplets (image, question, thinking trace, and answer) from domain-specific materials. |
Shijie Wang; Yilun Zhang; Zeyu Lai; Dexing Kong; | arxiv-cs.AI | 2025-06-09 |
| 716 | ScIRGen: Synthesize Realistic and Large-Scale RAG Dataset for Scientific Research Abstract: Scientific researchers need intensive information about datasets to effectively evaluate and develop theories and methodologies. The information needs regarding datasets are … |
JUNYONG LIN et al. | ArXiv | 2025-06-09 |
| 717 | KG2QA: Knowledge Graph-enhanced Retrieval-augmented Generation for Communication Standards Question Answering Highlight: The rapid evolution of communication technologies has led to an explosion of standards, rendering traditional expert-dependent consultation methods inefficient and slow. To address this challenge, we propose KG2QA, a question answering (QA) framework for communication standards that integrates fine-tuned large language models (LLMs) with a domain-specific knowledge graph (KG) via a retrieval-augmented generation (RAG) pipeline. |
Zhongze Luo; Weixuan Wan; Tianya Zhang; Dan Wang; Xiaoying Tang; | arxiv-cs.CL | 2025-06-08 |
| 718 | DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering Highlight: In this paper, we propose a Dual-vision Scene Perception Network (DSPNet), to comprehensively integrate multi-view and point cloud features to improve robustness in 3D QA. |
JINGZHOU LUO et al. | cvpr | 2025-06-07 |
| 719 | BIMBA: Selective-Scan Compression for Long-Range Video Question Answering IF:3 Highlight: In this work, we introduce BIMBA, an efficient state-space model to handle long-form videos. |
Md Mohaiminul Islam; Tushar Nagarajan; Huiyu Wang; Gedas Bertasius; Lorenzo Torresani; | cvpr | 2025-06-07 |
| 720 | M-LLM Based Video Frame Selection for Efficient Video Understanding IF:3 Highlight: However, it could lose crucial context in certain periods of a video, so that the downstream M-LLM may not have sufficient visual information to answer a question. To attack this pain point, we propose a lightweight M-LLM-based frame selection method that adaptively selects frames that are more relevant to users’ queries. |
KAI HU et al. | cvpr | 2025-06-07 |
| 721 | Zero-shot 3D Question Answering Via Voxel-based Dynamic Token Compression Highlight: Common methods, such as token pooling, reduce visual token usage but often lead to information loss, impairing the model’s ability to preserve visual details essential for 3D question answering tasks. To address this, we propose voxel-based Dynamic Token Compression (DTC), which combines 3D spatial priors and visual semantics to achieve over 90% reduction in visual token usage for current multi-frame VLMs. |
HSIANG-WEI HUANG et al. | cvpr | 2025-06-07 |
| 722 | VITED: Video Temporal Evidence Distillation Highlight: Moreover, they lack the ability to temporally localize such evidence in the broader context of the full video, which is required for answering complex questions. We propose a framework to enhance existing VideoQA datasets with evidence reasoning chains, automatically constructed by searching for optimal intervals of interest in the video with supporting evidence, that maximizes the likelihood of answering a given question. We train our model (ViTED) to generate these evidence chains directly, enabling it to both localize evidence windows as well as perform multi-step reasoning across them in long-form video content. We show the value of our evidence-distilled models on a suite of long video QA benchmarks where we outperform state-of-the-art approaches that lack evidence reasoning capabilities. |
Yujie Lu; Yale Song; William Wang; Lorenzo Torresani; Tushar Nagarajan; | cvpr | 2025-06-07 |
| 723 | Learning to Clarify By Reinforcement Learning Through Reward-Weighted Fine-Tuning Highlight: In this work, we learn to ask clarifying questions in QA agents. |
SUBHOJYOTI MUKHERJEE et al. | arxiv-cs.CL | 2025-06-07 |
| 724 | CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering IF:3 Highlight: In this paper, we propose an MLLMs-based dual momentum Mixture-of-Experts (CL-MoE) framework for continual visual question answering. |
TIANYU HUAI et al. | cvpr | 2025-06-07 |
| 725 | AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning Highlight: In this paper, a novel benchmark for audio-visual question answering continual learning (AVQACL) is introduced, aiming to study fine-grained scene understanding and spatial-temporal reasoning in videos under a continual learning setting. |
Kaixuan Wu; Xinde Li; Xinling Li; Chuanfei Hu; Guoliang Wu; | cvpr | 2025-06-07 |
| 726 | Cross-modal Causal Relation Alignment for Video Question Grounding Highlight: In this work, we propose a novel VideoQG framework named Cross-modal Causal Relation Alignment (CRA), to eliminate spurious correlations and improve the causal consistency between question-answering and video temporal grounding. |
WEIXING CHEN et al. | cvpr | 2025-06-07 |
| 727 | EgoLife: Towards Egocentric Life Assistant Highlight: We introduce EgoLife, a project to develop an egocentric life assistant that accompanies and enhances personal efficiency through AI-powered wearable glasses. |
JINGKANG YANG et al. | cvpr | 2025-06-07 |
| 728 | Unveiling The Mist Over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis IF:3 Highlight: To unveil the "mist", we propose Beacon3D, a benchmark for 3D-VL grounding and QA tasks, delivering a perspective shift in the evaluation of 3D-VL understanding. |
JIANGYONG HUANG et al. | cvpr | 2025-06-07 |
| 729 | Flexible Frame Selection for Efficient Video Reasoning IF:3 Highlight: In this work, we propose the Flexible Frame Selector (FFS), a learnable policy model with a new flexible selection operation, that helps alleviate input context restrictions by enabling video-language models to focus on the most informative frames for the downstream multimodal task, without adding undue processing cost. |
Shyamal Buch; Arsha Nagrani; Anurag Arnab; Cordelia Schmid; | cvpr | 2025-06-07 |
| 730 | Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering Highlight: In this paper, we transform the KBVQA into linguistic question-answering tasks so that we can leverage the rich world knowledge and strong reasoning abilities of Large Language Models (LLMs). |
ZHEN YANG et al. | cvpr | 2025-06-07 |
| 731 | EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering IF:3 Highlight: We introduce EgoTextVQA, a novel and rigorously constructed benchmark for egocentric QA assistance involving scene text. |
SHENG ZHOU et al. | cvpr | 2025-06-07 |
| 732 | Commonsense Video Question Answering Through Video-Grounded Entailment Tree Reasoning Highlight: This paper proposes the first video-grounded entailment tree reasoning method for commonsense video question answering (VQA). |
Huabin Liu; Filip Ilievski; Cees G. M. Snoek; | cvpr | 2025-06-07 |
| 733 | EASG-Bench: Video Q&A Benchmark with Egocentric Action Scene Graphs Highlight: We introduce EASG-Bench, a question-answering benchmark for egocentric videos where the question-answering pairs are created from spatio-temporally grounded dynamic scene graphs capturing intricate relationships among actors, actions, and objects. |
IVAN RODIN et al. | arxiv-cs.CV | 2025-06-06 |
| 734 | BioMol-MQA: A Multi-Modal Question Answering Dataset For LLM Reasoning Over Bio-Molecular Interactions Highlight: To address the gap, we present BioMol-MQA, a new question-answering (QA) dataset on polypharmacy, which is composed of two parts: (i) a multimodal knowledge graph (KG) with text and molecular structure for information retrieval; and (ii) challenging questions that are designed to test LLM capabilities in retrieving and reasoning over multimodal KG to answer questions. |
Saptarshi Sengupta; Shuhua Yang; Paul Kwong Yu; Fali Wang; Suhang Wang; | arxiv-cs.CL | 2025-06-06 |
| 735 | Tabular-textual Question Answering: From Parallel Program Generation to Large Language Models |
XUSHUO TANG et al. | World Wide Web | 2025-06-06 |
| 736 | UTSA-NLP at ArchEHR-QA 2025: Improving EHR Question Answering Via Self-Consistency Prompting Highlight: We describe our system for the ArchEHR-QA Shared Task on answering clinical questions using electronic health records (EHRs). |
Sara Shields-Menard; Zach Reimers; Joshua Gardner; David Perry; Anthony Rios; | arxiv-cs.CL | 2025-06-05 |
| 737 | Trustworthy Medical Question Answering: An Evaluation-Centric Survey Highlight: In this survey, we systematically examine six key dimensions of trustworthiness in medical QA, i.e., Factuality, Robustness, Fairness, Safety, Explainability, and Calibration. |
YINUO WANG et al. | arxiv-cs.CL | 2025-06-04 |
| 738 | Plugging Schema Graph Into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance Highlight: Existing methods based on semantic similarity work well only on simplified hand-crafted datasets and struggle to handle complex, real-world scenarios with numerous and diverse columns. To address this, we propose a graph-based framework that leverages human-curated relational knowledge to explicitly encode schema links and join paths. |
Xixi Wang; Miguel Costa; Jordanka Kovaceva; Shuai Wang; Francisco C. Pereira; | arxiv-cs.AI | 2025-06-04 |
| 739 | Hierarchical Question-Answering for Driving Scene Understanding Using Vision-Language Models Highlight: In this paper, we present a hierarchical question-answering (QA) approach for scene understanding in autonomous vehicles, balancing cost-efficiency with detailed visual interpretation. |
Safaa Abdullahi Moallim Mohamud; Minjin Baek; Dong Seog Han; | arxiv-cs.CV | 2025-06-03 |
| 740 | ESGenius: Benchmarking LLMs on Environmental, Social, and Governance (ESG) and Sustainability Knowledge Highlight: We introduce ESGenius, a comprehensive benchmark for evaluating and enhancing the proficiency of Large Language Models (LLMs) in Environmental, Social, and Governance (ESG) and sustainability-focused question answering. |
CHAOYUE HE et al. | arxiv-cs.CL | 2025-06-02 |
| 741 | IQUEST: An Iterative Question-Guided Framework for Knowledge Base Question Answering Highlight: Knowledge Base Question Answering (KBQA), which queries and reasons over KGs, is central to this effort, especially for complex, multi-hop queries. However, multi-hop reasoning poses two key challenges: (1) maintaining coherent reasoning paths, and (2) avoiding prematurely discarding critical multi-hop connections. To address these issues, we introduce iQUEST, a question-guided KBQA framework that iteratively decomposes complex queries into simpler sub-questions, ensuring a structured and focused reasoning trajectory. Additionally, we integrate a Graph Neural Network (GNN) to look ahead and incorporate 2-hop neighbor information at each reasoning step. |
Shuai Wang; Yinan Yu; | arxiv-cs.CL | 2025-06-02 |
| 742 | COMPKE: Complex Question Answering Under Knowledge Editing Highlight: However, we argue that these benchmarks fail to effectively evaluate how well the updated models apply this knowledge in real-life scenarios, particularly when questions require complex reasoning, involving one-to-many relationships or multi-step logical intersections. To fill in this gap, we introduce a new benchmark, COMPKE: Complex Question Answering under Knowledge Editing, which includes 11,924 complex questions that reflect real-life situations. |
KEYUAN CHENG et al. | arxiv-cs.CL | 2025-06-01 |
| 743 | MKGF: A Multi-modal Knowledge Graph Based RAG Framework to Enhance LVLMs for Medical Visual Question Answering |
YINAN WU et al. | Neurocomputing | 2025-06-01 |
| 744 | Efficient Multimodal Selection for Retrieval in Knowledge-Based Visual Question Answering Abstract: Retrieval plays an important role in knowledge-based visual question answering (KB-VQA), which relies on external knowledge to answer questions related to an image. However, not … |
Linyin Luo; Hanjiang Lai; Yan Pan; Jian Yin; | IEEE Transactions on Circuits and Systems for Video … | 2025-06-01 |
| 745 | Bi-directional Dual Contrastive Adapting Method for Alleviating Hallucination in Visual Question Answering |
HAOLONG YAN et al. | Expert Syst. Appl. | 2025-06-01 |
| 746 | Final: Combining First-Order Logic With Natural Logic for Question Answering Abstract: Many question-answering problems can be approached as textual entailment tasks, where the hypotheses are formed by the question and candidate answers, and the premises are derived … |
JIHAO SHI et al. | IEEE Transactions on Knowledge and Data Engineering | 2025-06-01 |
| 747 | Explainable Medical Visual Question Answering Via Chain of Evidence |
CHEN QIU et al. | Knowl. Based Syst. | 2025-06-01 |
| 748 | Thoughtful and Cautious Reasoning: A Fine-tuned Knowledge Graph-based Multi-hop Question Answering Framework |
Yinghao Zheng; Ling Lu; Yang Hu; Yinong Chen; Aijuan Wang; | Eng. Appl. Artif. Intell. | 2025-06-01 |
| 749 | OntoRAG: Enhancing Question-Answering Through Automated Ontology Derivation from Unstructured Knowledge Bases Highlight: This paper introduces OntoRAG, an automated pipeline designed to derive ontologies from unstructured knowledge bases, with a focus on electrical relay documents. |
Yash Tiwari; Owais Ahmad Lone; Mayukha Pal; | arxiv-cs.AI | 2025-05-31 |
| 750 | Probing The Geometry of Truth: Consistency and Generalization of Truth Directions in LLMs Across Logical Transformations and Question Answering Tasks Highlight: Large language models (LLMs) are trained on extensive datasets that encapsulate substantial world knowledge. |
YUNTAI BAO et al. | arxiv-cs.CL | 2025-05-31 |
| 751 | DeepRAG: Integrating Hierarchical Reasoning and Process Supervision for Biomedical Multi-Hop QA Highlight: We propose DeepRAG, a novel framework that integrates DeepSeek hierarchical question decomposition capabilities with RAG Gym unified retrieval-augmented generation optimization using process level supervision. |
YUELYU JI et al. | arxiv-cs.CL | 2025-05-31 |
| 752 | Inter-Passage Verification for Multi-evidence Multi-answer QA Highlight: Multi-answer question answering (QA), where questions can have many valid answers, presents a significant challenge for existing retrieval-augmented generation-based QA systems, as these systems struggle to retrieve and then synthesize a large number of evidence passages. To tackle these challenges, we propose a new multi-answer QA framework: Retrieval-augmented Independent Reading with Inter-passage Verification (RI$^2$VER). |
Bingsen Chen; Shengjie Wang; Xi Ye; Chen Zhao; | arxiv-cs.CL | 2025-05-31 |
| 753 | Grid-LOGAT: Grid Based Local and Global Area Transcription for Video Question Answering Highlight: In this paper, we propose a Grid-based Local and Global Area Transcription (Grid-LoGAT) system for Video Question Answering (VideoQA). |
MD INTISAR CHOWDHURY et al. | arxiv-cs.CV | 2025-05-30 |
| 754 | Exploring The Impact of Occupational Personas on Domain-Specific QA Highlight: This study analyzes whether personas enhance specialized QA performance by introducing two types of persona: Profession-Based Personas (PBPs) (e.g., scientist), which directly relate to domain expertise, and Occupational Personality-Based Personas (OPBPs) (e.g., scientific person), which reflect cognitive tendencies rather than explicit expertise. |
Eojin Kang; Jaehyuk Yu; Juae Kim; | arxiv-cs.CL | 2025-05-30 |
| 755 | ComposeRAG: A Modular and Composable RAG for Corpus-Grounded Multi-Hop Question Answering Highlight: We introduce ComposeRAG, a novel modular abstraction that decomposes RAG pipelines into atomic, composable modules. |
RUOFAN WU et al. | arxiv-cs.CL | 2025-05-30 |
| 756 | TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine Highlight: However, existing evaluation datasets are limited in scope and primarily text-based, lacking a unified and standardized multimodal question-answering (QA) benchmark. To address this issue, we introduce TCM-Ladder, the first multimodal QA dataset specifically designed for evaluating large TCM language models. |
JIACHENG XIE et al. | arxiv-cs.CL | 2025-05-29 |
| 757 | MedPAIR: Measuring Physicians and AI Relevance Alignment in Medical Question Answering Highlight: In this study, we introduce the MedPAIR (Medical Dataset Comparing Physicians and AI Relevance Estimation and Question Answering) dataset to evaluate how physician trainees and LLMs prioritize relevant information when answering QA questions. |
YUEXING HAO et al. | arxiv-cs.CL | 2025-05-29 |
| 758 | Climate Finance Bench Highlight: We curate 33 recent sustainability reports in English drawn from companies across all 11 GICS sectors and annotate 330 expert-validated question-answer pairs that span pure extraction, numerical reasoning, and logical reasoning. Building on this dataset, we propose a comparison of RAG (retrieval-augmented generation) approaches. |
RAFIK MANKOUR et al. | arxiv-cs.CL | 2025-05-28 |
| 759 | Improving QA Efficiency with DistilBERT: Fine-Tuning and Inference on Mobile Intel CPUs Highlight: This study presents an efficient transformer-based question-answering (QA) model optimized for deployment on a 13th Gen Intel i7-1355U CPU, using the Stanford Question Answering Dataset (SQuAD) v1.1. |
Ngeyen Yinkfu; | arxiv-cs.CL | 2025-05-28 |
| 760 | Rethinking Information Synthesis in Multimodal Question Answering A Multi-Agent Perspective Highlight: While these approaches have shown strong performance, they often rely on a single, generalized reasoning strategy, overlooking the unique characteristics of each modality, ultimately limiting both accuracy and interpretability. To address these limitations, we propose MAMMQA, a multi-agent QA framework for multimodal inputs spanning text, tables, and images. |
Krishna Singh Rajput; Tejas Anvekar; Chitta Baral; Vivek Gupta; | arxiv-cs.CL | 2025-05-27 |
| 761 | Faithfulness-Aware Uncertainty Quantification for Fact-Checking The Output of Retrieval Augmented Generation Highlight: In this paper, we introduce FRANQ (Faithfulness-based Retrieval Augmented UNcertainty Quantification), a novel method for hallucination detection in RAG outputs. |
EKATERINA FADEEVA et al. | arxiv-cs.CL | 2025-05-27 |
| 762 | It’s High Time: A Survey of Temporal Question Answering Highlight: In this survey, we provide a comprehensive overview of Temporal Question Answering (TQA), a research area that focuses on answering questions involving temporal constraints or context. |
Bhawna Piryani; Abdelrahman Abdallah; Jamshid Mozafari; Avishek Anand; Adam Jatowt; | arxiv-cs.CL | 2025-05-26 |
| 763 | Benchmarking Large Multimodal Models for Ophthalmic Visual Question Answering with OphthalWeChat Highlight: Subset-specific performance showed Gemini 2.0 Flash excelled in Binary_CN (0.687), Single-choice_CN (0.666), and Single-choice_EN (0.646), while GPT-4o ranked highest in Binary_EN (0.717), Open-ended_CN (BLEU-1: 0.301; BERTScore: 0.382), and Open-ended_EN (BLEU-1: 0.183; BERTScore: 0.240). Conclusions: This study presents the first bilingual VQA benchmark for ophthalmology, distinguished by its real-world context and inclusion of multiple examinations per patient. |
PUSHENG XU et al. | arxiv-cs.CV | 2025-05-26 |
| 764 | GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation Highlight: While synthetic data generation has emerged as a promising solution, existing approaches frequently suffer from factual inaccuracies, insufficient long-tail coverage, simplistic knowledge structures, and homogenized outputs. To address these challenges, we introduce GraphGen, a knowledge graph-guided framework designed for three key question-answering (QA) scenarios: atomic QA, aggregated QA, and multi-hop QA. |
ZIHONG CHEN et al. | arxiv-cs.CL | 2025-05-26 |
| 765 | AMQA: An Adversarial Dataset for Benchmarking Bias of LLMs in Medicine and Healthcare Highlight: Bias linked to race, sex, and socioeconomic status is already well known, but a consistent and automatic testbed for measuring it is missing. To fill this gap, this paper presents AMQA, an Adversarial Medical Question-Answering dataset, built for automated, large-scale bias evaluation of LLMs in medical QA. |
YING XIAO et al. | arxiv-cs.AI | 2025-05-26 |
| 766 | Automated Text-to-Table for Reasoning-Intensive Table QA: Pipeline Design and Benchmarking Insights Highlight: Existing research is constrained by two primary bottlenecks: 1) Reliance on costly manually annotated real-world data, which is difficult to cover complex reasoning scenarios; 2) The heterogeneity of table structures hinders systematic analysis of the intrinsic mechanisms behind the underperformance of LLMs, especially in reasoning-intensive tasks. To address these issues, we propose an automated generation pipeline AutoT2T that transforms mathematical word problems into table-based reasoning tasks, eliminating the need for manual annotation. |
SHI-YU TIAN et al. | arxiv-cs.AI | 2025-05-26 |
| 767 | Beyond Chunking: Discourse-Aware Hierarchical Retrieval for Long Document Question Answering Highlight: We present a discourse-aware hierarchical framework that leverages rhetorical structure theory (RST) to enhance long document question answering. |
Huiyao Chen; Yi Yang; Yinghui Li; Meishan Zhang; Min Zhang; | arxiv-cs.IR | 2025-05-26 |
| 768 | Hypercube-Based Retrieval-Augmented Generation for Scientific Question-Answering Highlight: In this work, we introduce a multi-dimensional (cube) structure, Hypercube, which can index and allocate documents in a pre-defined multi-dimensional space. |
JIMENG SHI et al. | arxiv-cs.LG | 2025-05-25 |
| 769 | GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance Highlight: We introduce a novel four-stage framework called Grounding Caption-Guided Knowledge-Based Visual Question Answering (GC-KBVQA), which enables LLMs to effectively perform zero-shot VQA tasks without the need for end-to-end multimodal training. |
Mohammad Mahdi Moradi; Sudhir Mudur; | arxiv-cs.CL | 2025-05-25 |
| 770 | SpokenNativQA: Multilingual Everyday Spoken Queries for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we introduce SpokenNativQA, the first multilingual and culturally aligned spoken question-answering (SQA) dataset designed to evaluate LLMs in real-world conversational settings. |
Firoj Alam; Md Arid Hasan; Shammur Absar Chowdhury; | arxiv-cs.CL | 2025-05-25 |
| 771 | MM-Prompt: Cross-Modal Prompt Tuning for Continual Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, most existing methods adopt cross-modal prompt isolation, constructing visual and textual prompts separately, which exacerbates modality imbalance and leads to degraded performance over time. To tackle this issue, we propose MM-Prompt, a novel framework incorporating cross-modal prompt query and cross-modal prompt recovery. |
Xu Li; Fan Lyu; | arxiv-cs.CV | 2025-05-25 |
| 772 | Toward Human Centered Interactive Clinical Question Answering System Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work introduces an interactive QA system that enables physicians to query clinical notes via text or voice and receive extractive answers highlighted directly in the note for traceability. |
Dina Albassam; | arxiv-cs.HC | 2025-05-24 |
| 773 | Enhancing Large Vision-Language Models with Layout Modality for Table Question Answering on Japanese Annual Securities Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we propose a method to enhance LVLM-based table understanding by incorporating in-table textual content and layout features. |
Hayato Aida; Kosuke Takahashi; Takahiro Omi; | arxiv-cs.CL | 2025-05-23 |
| 774 | How Knowledge Popularity Influences and Enhances LLM Knowledge Boundary Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate how knowledge popularity affects LLMs’ ability to perceive their knowledge boundaries. |
Shiyu Ni; Keping Bi; Jiafeng Guo; Xueqi Cheng; | arxiv-cs.CL | 2025-05-23 |
| 775 | PerMedCQA: Benchmarking Large Language Models on Medical Consumer Question Answering in Persian Language Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite recent advances in large language models (LLMs) for medical QA, consumer-oriented and multilingual resources, particularly in low-resource languages like Persian, remain sparse. To bridge this gap, we present PerMedCQA, the first Persian-language benchmark for evaluating LLMs on real-world, consumer-generated medical questions. |
Naghmeh Jamali; Milad Mohammadi; Danial Baledi; Zahra Rezvani; Hesham Faili; | arxiv-cs.CL | 2025-05-23 |
| 776 | A Question-type Guided and Progressive Self-attention Network for Remote Sensing Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jiangfan Feng; Hui Wang; Shaokang Dong; | Earth Sci. Informatics | 2025-05-22 |
| 777 | O$^2$-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Open-ended questions, which characterized by lacking a standard answer or providing non-unique and diverse answers, remain underexplored. To bridge this gap, we present O$^2$-Searcher, a novel search agent leveraging reinforcement learning to effectively tackle both open-ended and closed-ended questions in the open domain. |
JIANBIAO MEI et. al. | arxiv-cs.CL | 2025-05-22 |
| 778 | T$^2$: An Adaptive Test-Time Scaling Strategy for Contextual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: But they add human bias to the reasoning process and fail to leverage models’ inherent reasoning capabilities. To address these limitations, we present T$^2$: Think-to-Think, a novel framework that dynamically adapts reasoning depth based on question complexity. |
ZHENGYI ZHAO et. al. | arxiv-cs.CL | 2025-05-22 |
| 779 | CT-Agent: A Multimodal-LLM Agent for 3D CT Radiology Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing VQA systems cannot adequately handle the CT radiology question answering (CTQA) task for: (1) anatomic complexity makes CT images difficult to understand; (2) spatial relationship across hundreds slices is difficult to capture. To address these issues, this paper proposes CT-Agent, a multimodal agentic framework for CTQA. |
Yuren Mao; Wenyi Xu; Yuyang Qin; Yunjun Gao; | arxiv-cs.CV | 2025-05-22 |
| 780 | Benchmarking Retrieval-Augmented Multimomal Generation for Document Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce MMDocRAG, a comprehensive benchmark featuring 4,055 expert-annotated QA pairs with multi-page, cross-modal evidence chains. |
KUICAI DONG et. al. | arxiv-cs.IR | 2025-05-22 |
| 781 | O2-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large Language Models (LLMs), despite their advancements, are fundamentally limited by their static parametric knowledge, hindering performance on tasks requiring open-domain … |
JIANBIAO MEI et. al. | ArXiv | 2025-05-22 |
| 782 | Teaching Large Language Models to Maintain Contextual Faithfulness Via Synthetic Tasks and Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Notably, Dual-GRPO eliminates the need to manually label preference data to train reward models and avoids over-optimizing short-form generation when relying only on the synthesized short-form QA data. |
SHUZHENG SI et. al. | arxiv-cs.CL | 2025-05-22 |
| 783 | ViQAgent: Zero-Shot Video Question Answering Via Agent with Open-Vocabulary Grounding Validation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Recent advancements in Video Question Answering (VideoQA) have introduced LLM-based agents, modular frameworks, and procedural solutions, yielding promising results. |
Tony Montes; Fernando Lozano; | arxiv-cs.CV | 2025-05-21 |
| 784 | LiveVLM: Efficient Online Video Understanding Via Streaming-Oriented KV Cache and Retrieval IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Nonetheless, studies predominantly focus on offline video question answering, neglecting memory usage and response speed that are essential in various real-world applications, such as Deepseek services, autonomous driving, and robotics. To mitigate these challenges, we propose LiveVLM, a training-free framework specifically designed for streaming, online video understanding and real-time interaction. |
ZHENYU NING et. al. | arxiv-cs.CV | 2025-05-21 |
| 785 | Social Bias in Popular Question-Answering Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We perform a qualitative content analysis of 30 benchmark papers and a quantitative analysis of 20 respective benchmark datasets to learn (1) who is involved in the benchmark creation, (2) how social bias is addressed or prevented, and (3) whether the demographics of the creators and annotators correspond to particular biases in the content. |
Angelie Kraft; Judith Simon; Sonja Schimmler; | arxiv-cs.CL | 2025-05-21 |
| 786 | Visual Question Answering on Multiple Remote Sensing Image Modalities Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose to add multiple image modalities to VQA in the particular context of remote sensing, leading to a novel task for the computer vision community. |
HICHEM BOUSSAID et. al. | arxiv-cs.CV | 2025-05-21 |
| 787 | A Comprehensive Evaluation of Embedding Models and LLMs for IR and QA Across English and Italian Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study presents a comprehensive evaluation of embedding techniques and large language models (LLMs) for Information Retrieval (IR) and question answering (QA) across … |
Ermelinda Oro; Francesco Maria Granata; M. Ruffolo; | Big Data Cogn. Comput. | 2025-05-21 |
| 788 | StepSearch: Igniting LLMs Search Ability Via Step-Wise Proximal Policy Optimization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Previous work has explored reinforcement learning (RL) to train LLMs to perform search-based document retrieval, achieving notable improvements in QA performance, but underperform on complex, multi-hop QA resulting from the sparse rewards from global signal only. To address this gap in existing research, we introduce StepSearch, a framework for search LLMs that trained with step-wise proximal policy optimization method. |
ZILIANG WANG et. al. | arxiv-cs.CL | 2025-05-21 |
| 789 | KaFT: Knowledge-aware Fine-tuning for Boosting LLMs’ Domain-specific Question-Answering Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We find that 1) training samples with varied conflicts contribute differently, where SFT on the data with large conflicts leads to catastrophic performance drops; 2) compared to directly filtering out the conflict data, appropriately applying the conflict data would be more beneficial. Motivated by this, we propose a simple-yet-effective Knowledge-aware Fine-tuning (namely KaFT) approach to effectively boost LLMs’ performance. |
QIHUANG ZHONG et. al. | arxiv-cs.CL | 2025-05-21 |
| 790 | QA-prompting: Improving Summarization with Large Language Models Using Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, models often struggle with long-context summarization due to positional biases, leading to suboptimal extraction of critical information. There are techniques to improve this with fine-tuning, pipelining, or using complex techniques, which have their own challenges. To solve these challenges, we propose QA-prompting – a simple prompting method for summarization that utilizes question-answering as an intermediate step prior to summary generation. |
Neelabh Sinha; | arxiv-cs.CL | 2025-05-20 |
| 791 | CRAFT: Training-Free Cascaded Retrieval for Tabular QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our approach achieves better retrieval performance than state-of-the-art (SOTA) sparse, dense, and hybrid retrievers. |
Adarsh Singh; Kushal Raj Bhandari; Jianxi Gao; Soham Dan; Vivek Gupta; | arxiv-cs.CL | 2025-05-20 |
| 792 | Reinforcing Question Answering Agents with Minimalist Policy Gradient Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose Mujica, a Multi-hop Joint Intelligence for Complex Question Answering, comprising a planner that decomposes questions into a directed acyclic graph of subquestions and a worker that resolves questions via retrieval and reasoning. |
YIHONG WU et. al. | arxiv-cs.CL | 2025-05-20 |
| 793 | VoQA: Visual-only Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Visual-only Question Answering (VoQA), a novel multimodal task in which questions are visually embedded within images, without any accompanying textual input. |
Luyang Jiang; Jianing An; Jie Luo; Wenjun Wu; Lei Huang; | arxiv-cs.CV | 2025-05-20 |
| 794 | Automatic Dataset Generation for Knowledge Intensive Question Answering Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A novel approach for enhancing Large Language Models (LLMs) in knowledge-intensive QA tasks is presented through the automated generation of context-based QA pairs. |
Sizhe Yuen; Ting Su; Ziyang Wang; Yali Du; Adam J. Sobey; | arxiv-cs.CL | 2025-05-20 |
| 795 | Beyond Chains: Bridging Large Language Models and Knowledge Bases in Complex Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by semantic parsing methods, we propose PDRR: a four-stage framework consisting of Predict, Decompose, Retrieve, and Reason. |
Yihua Zhu; Qianying Liu; Akiko Aizawa; Hidetoshi Shimodaira; | arxiv-cs.CL | 2025-05-20 |
| 796 | Texts or Images? A Fine-grained Analysis on The Effectiveness of Input Representations and Models for Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we conduct the first controlled study on the effectiveness of several combinations of table representations and models from two perspectives: question complexity and table size. |
Wei Zhou; Mohsen Mesgar; Heike Adel; Annemarie Friedrich; | arxiv-cs.CL | 2025-05-20 |
| 797 | YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce YESciEval, an open-source framework that combines fine-grained rubric-based assessment with reinforcement learning to mitigate optimism bias in LLM evaluators. |
Jennifer D’Souza; Hamed Babaei Giglou; Quentin Münch; | arxiv-cs.CL | 2025-05-20 |
| 798 | Domain Adaptation of VLM for Soccer Video Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Vision Language Models (VLMs) have demonstrated strong performance in multi-modal tasks by effectively aligning visual and textual representations. However, most video understanding VLM research has been domain-agnostic, leaving the understanding of their transfer learning capability to specialized domains under-explored. In this work, we address this by exploring the adaptability of open-source VLMs to specific domains, and focusing on soccer as an initial case study. |
Tiancheng Jiang; Henry Wang; Md Sirajus Salekin; Parmida Atighehchian; Shinan Zhang; | arxiv-cs.CV | 2025-05-19 |
| 799 | Q${}^2$Forge: Minting Competency Questions and SPARQL Queries for Question-Answering Over Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present Q${}^2$Forge that addresses the challenge of generating new competency questions for a KG and corresponding SPARQL queries. |
Yousouf Taghzouti; Franck Michel; Tao Jiang; Louis-Félix Nothias; Fabien Gandon; | arxiv-cs.DB | 2025-05-19 |
| 800 | Structured Retrieval-Augmented Generation for Multi-Entity Question Answering Over Heterogeneous Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Teng Lin; | 2025 IEEE 41st International Conference on Data Engineering … | 2025-05-19 |
| 801 | FeVisQA: Free-Form Question Answering Over Data Visualizations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a new task named FeVisQA, referring to Free-form Question Answering over data Visualizations. |
YUANFENG SONG et. al. | icde | 2025-05-19 |
| 802 | Disambiguation in Conversational Question Answering in The Era of LLMs and Agents: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By offering a comprehensive review of current research on ambiguities and disambiguation with LLMs, we aim to contribute to the development of more robust and reliable LLM-based systems. |
MD MEHRAB TANJIM et. al. | arxiv-cs.CL | 2025-05-18 |
| 803 | SurveillanceVQA-589K: A Benchmark for Comprehensive Surveillance Video-Language Understanding with Large Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce SurveillanceVQA-589K, the largest open-ended video question answering benchmark tailored to the surveillance domain. |
BO LIU et. al. | arxiv-cs.CV | 2025-05-18 |
| 804 | Beyond Retrieval: Joint Supervision and Multimodal Document Ranking for Textbook Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel approach to multimodal textbook question answering by introducing a mechanism for enhancing semantic representations through multi-objective joint training. |
Hessa Alawwad; Usman Naseem; Areej Alhothali; Ali Alkhathlan; Amani Jamal; | arxiv-cs.IR | 2025-05-17 |
| 805 | Recursive Question Understanding for Complex Question Answering Over Heterogeneous Personal Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present ReQAP, a novel method that creates an executable operator tree for a given question, via recursive decomposition. |
Philipp Christmann; Gerhard Weikum; | arxiv-cs.CL | 2025-05-17 |
| 806 | A Dataset for Spatiotemporal-Sensitive POI Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, existing Question-Answering (QA) datasets lack sufficient spatiotemporal-sensitive questions, making them inadequate benchmarks for evaluating models’ spatiotemporal reasoning capabilities. To address this gap, we introduce POI-QA, a novel spatiotemporal-sensitive QA dataset centered on Point of Interest (POI), constructed through three key steps: mining and aligning open-source vehicle trajectory data from GAIA with high-precision geographic POI data, rigorous manual validation of noisy spatiotemporal facts, and generating bilingual (Chinese/English) QA pairs that reflect human-understandable spatiotemporal reasoning tasks. |
XIAO HAN et. al. | arxiv-cs.LG | 2025-05-16 |
| 807 | THELMA: Task Based Holistic Evaluation of Large Language Model Applications-RAG Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose THELMA (Task Based Holistic Evaluation of Large Language Model Applications), a reference free framework for RAG (Retrieval Augmented generation) based question answering (QA) applications. |
UDITA PATEL et. al. | arxiv-cs.CL | 2025-05-16 |
| 808 | Semantic Caching of Contextual Summaries for Efficient Question-Answering with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a novel semantic caching approach for storing and reusing intermediate contextual summaries, enabling efficient information reuse across similar queries in LLM-based QA workflows. |
Camille Couturier; Spyros Mastorakis; Haiying Shen; Saravan Rajmohan; Victor Rühle; | arxiv-cs.CL | 2025-05-16 |
| 809 | CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, they still face challenges in balancing retrieval precision and recall, impacting their efficacy in answering questions. To address this, we introduce CAFE, a two-stage coarse-to-fine method to enhance multi-document question-answering capacities. |
Han Peng; Jinhao Jiang; Zican Dong; Wayne Xin Zhao; Lei Fang; | arxiv-cs.CL | 2025-05-15 |
| 810 | Enhancing Multi-Image Question Answering Via Submodular Subset Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose an enhancement for retriever framework introduced in MIRAGE model using submodular subset selection techniques. |
Aaryan Sharma; Shivansh Gupta; Samar Agarwal; Vishak Prasad C.; Ganesh Ramakrishnan; | arxiv-cs.CV | 2025-05-15 |
| 811 | WixQA: A Multi-Dataset Benchmark for Enterprise Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Critically, evaluating end-to-end RAG systems requires benchmarks comprising not only question–answer pairs but also the specific knowledge base (KB) snapshot from which answers were derived. To address this need, we introduce WixQA, a benchmark suite featuring QA datasets precisely grounded in the released KB corpus, enabling holistic evaluation of retrieval and generation components. |
DVIR COHEN et. al. | arxiv-cs.AI | 2025-05-13 |
| 812 | Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Task 5 of the DCASE 2025 Challenge: an Audio Question Answering (AQA) benchmark spanning multiple domains of sound understanding. |
CHAO-HAN HUCK YANG et. al. | arxiv-cs.SD | 2025-05-12 |
| 813 | Efficient and Reproducible Biomedical Question Answering Using Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This study systematically examines a Retrieval-Augmented Generation (RAG) system for biomedical QA, evaluating retrieval strategies and response time trade-offs. |
Linus Stuhlmann; Michael Alexander Saxer; Jonathan Fürst; | arxiv-cs.IR | 2025-05-12 |
| 814 | NeoQA: Evidence-based Question Answering with Generated News Events Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Questions initially requiring retrieval may become answerable from pretraining knowledge as newer models incorporate more recent information during pretraining, making it difficult to distinguish evidence-based reasoning from recall. We introduce NeoQA (News Events for Out-of-training Question Answering), a benchmark designed to address this issue. |
Max Glockner; Xiang Jiang; Leonardo F. R. Ribeiro; Iryna Gurevych; Markus Dreyer; | arxiv-cs.CL | 2025-05-09 |
| 815 | ChartCitor: Answer Citations for ChartQA Via Multi-Agent LLM Retrieval Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large Language Models (LLMs) can perform chart question answering tasks but often generate unverified hallucinated responses. Existing answer attribution methods struggle to … |
Kanika Goswami; Puneet Mathur; R. Rossi; Franck Dernoncourt; | Companion Proceedings of the ACM on Web Conference 2025 | 2025-05-08 |
| 816 | A Multi-Agent Framework for Multi-Source Manufacturing Knowledge Integration and Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The integration of next-generation information technologies in the manufacturing sector has resulted in the generation of extensive and complex knowledge data by organizations. … |
Teng Mao; Shuangtao Yang; Bo Fu; | Companion Proceedings of the ACM on Web Conference 2025 | 2025-05-08 |
| 817 | StructRAG: Structure-Aware RAG Framework with Scholarly Knowledge Graph for Diverse Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recent advances in Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) have shown promise in academic question answering. However, existing approaches often fail … |
Runsong Jia; Bowen Zhang; Sergio José Rodríguez Méndez; Pouya Ghiasnezhad Omran; | Companion Proceedings of the ACM on Web Conference 2025 | 2025-05-08 |
| 818 | IP-VQA Dataset: Empowering Precision Agriculture with Autonomous Insect Pest Management Through Visual Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Precision agriculture is essential for social good, global economy and food security, yet insect pests threaten productivity through crop damage, pathogen spread, and rising pest … |
Kairui Jin; Xing Zi; Karthick Thiyagarajan; Ali Braytee; Mukesh Prasad; | Companion Proceedings of the ACM on Web Conference 2025 | 2025-05-08 |
| 819 | Fine-Tuning Large Language Models and Evaluating Retrieval Methods for Improved Question Answering on Building Codes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study focuses on identifying a suitable retriever method for building codes and optimizing the generational capability of the language model using fine-tuning techniques. |
Mohammad Aqib; Mohd Hamza; Qipei Mei; Ying Hei Chui; | arxiv-cs.CL | 2025-05-07 |
| 820 | IndicSQuAD: A Comprehensive Multilingual Question Answering Dataset for Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present IndicSQuAD, a comprehensive multi-lingual extractive QA dataset covering nine major Indic languages, systematically derived from the SQuAD dataset. |
Sharvi Endait; Ruturaj Ghatage; Aditya Kulkarni; Rajlaxmi Patil; Raviraj Joshi; | arxiv-cs.CL | 2025-05-06 |
| 821 | A Reasoning-Focused Legal Retrieval Benchmark IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: An obstacle to the development of specialized RAG systems is the lack of realistic legal RAG benchmarks which capture the complexity of both legal retrieval and downstream legal question-answering. To address this, we introduce two novel legal RAG benchmarks: Bar Exam QA and Housing Statute QA. |
LUCIA ZHENG et. al. | arxiv-cs.CL | 2025-05-06 |
| 822 | LLM-Driven Data Augmentation for Visual Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Remote Sensing Visual Question Answering (RSVQA) is a task aiming at automatic answering questions related to overhead imagery. Many studies have been conducted in recent years, … |
Hichem Boussaid; Nayoung Kwon; Camille Kurtz; Laurent Wendling; Sylvain Lobry; | 2025 Joint Urban Remote Sensing Event (JURSE) | 2025-05-05 |
| 823 | Structure Causal Models and LLMs Integration in Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a causal inference framework for the MedVQA task, which effectively eliminates the relative confounding effect between the image and the question to ensure the precision of the question-answering (QA) session. |
Zibo Xu; Qiang Li; Weizhi Nie; Weijie Wang; Anan Liu; | arxiv-cs.CV | 2025-05-05 |
| 824 | PeerQA: A Scientific Question Answering Dataset from Peer Reviews Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present PeerQA, a real-world, scientific, document-level Question Answering (QA) dataset. |
Tim Baumgärtner; Ted Briscoe; Iryna Gurevych; | naacl | 2025-05-04 |
| 825 | CuriousLLM: Elevating Multi-Document Question Answering with LLM-Enhanced Knowledge Graph Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose CuriousLLM, an enhancement that integrates a curiosity-driven reasoning mechanism into an LLM agent. |
Zukang Yang; Zixuan Zhu; Jennifer Zhu; | naacl | 2025-05-04 |
| 826 | SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, adapting general-purpose RAG systems to specialized fields such as science and medicine poses unique challenges due to distribution shifts and limited access to domain-specific data. To tackle this, we propose SimRAG, a self-training approach that equips LLMs with joint capabilities of question answering and question generation for domain adaptation. |
RAN XU et. al. | naacl | 2025-05-04 |
| 827 | Automatic Evaluation of Healthcare LLMs Beyond Question-Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work is focused on the healthcare domain, where both factuality and discourse matter greatly. It introduces a comprehensive, multi-axis suite for healthcare LLM evaluation, exploring correlations between open and close benchmarks and metrics. |
ANNA ARIAS-DUART et. al. | naacl | 2025-05-04 |
| 828 | ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present a novel evaluation dataset, ProMQA, to measure the advancement of systems in application-oriented scenarios. |
KIMIHIRO HASEGAWA et. al. | naacl | 2025-05-04 |
| 829 | VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose VisDoMRAG, a novel multimodal Retrieval Augmented Generation (RAG) approach that simultaneously utilizes visual and textual RAG, combining robust visual retrieval capabilities with sophisticated linguistic reasoning. |
MANAN SURI et. al. | naacl | 2025-05-04 |
| 830 | MAPWise: Evaluating Vision-Language Models for Advanced Map Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study investigates the efficacy of VLMs in answering questions based on choropleth maps, which are widely used for data analysis and representation. To facilitate and encourage research in this area, we introduce a novel map-based question-answering benchmark, consisting of maps from three geographical regions (United States, India, China), each containing around 1000 questions. |
SRIJA MUKHOPADHYAY et. al. | naacl | 2025-05-04 |
| 831 | Diversify-verify-adapt: Efficient and Robust Retrieval-Augmented Ambiguous Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although the iterative RAG approach has been proposed to address this problem, it comes at the cost of significantly reduced efficiency. To address these issues, we propose the diversify-verify-adapt (DIVA) framework. |
YEONJUN IN et. al. | naacl | 2025-05-04 |
| 832 | CoRAC: Integrating Selective API Document Retrieval with Question Semantic Intent for Code Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a knowledge-based framework, CoRAC, an automatic code question responder that enhances understanding through selective API document retrieval and question semantic intent clustering. |
YunSeok Choi; CheolWon Na; Jee-Hyong Lee; | naacl | 2025-05-04 |
| 833 | Pointwise Mutual Information As A Performance Gauge for Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, there is no method to date that exploits this phenomenon to improve generation. To fill this gap, in this study, we show that the pointwise mutual information between a context and a question is an effective gauge for language model performance. |
TIANYU LIU et. al. | naacl | 2025-05-04 |
| 834 | VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities Via Single-Stage Joint Speech-Text Supervised Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Another critical challenge with SpeechLMs is catastrophic forgetting, where models optimized for speech tasks suffer significant degradation in text-only performance. To mitigate these issues, we propose a novel single-stage joint speech-text SFT approach on the low-rank adaptation (LoRA) of the LLM backbone. |
YIFAN PENG et. al. | naacl | 2025-05-04 |
| 835 | Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Moreover, the lack of reference explanations means we cannot easily evaluate the reasoning of model decisions, a crucial component of supporting doctors in making complex medical decisions. To address these challenges, we construct two new datasets: JAMA Clinical Challenge and Medbullets. |
Hanjie Chen; Zhouxiang Fang; Yash Singla; Mark Dredze; | naacl | 2025-05-04 |
| 836 | Reverse Question Answering: Can An LLM Write A Question So Hard (or Bad) That It Can’t Answer? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By finding question and answer types that lead to RQA errors, we suggest improvements for LLM reasoning. |
NISHANT BALEPUR et. al. | naacl | 2025-05-04 |
| 837 | DeCAP: Context-Adaptive Prompt Generation for Debiasing Zero-shot Question Answering in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing zero-shot methods are efficient but fail to consider context and prevent bias propagation in the answers. To address this, we propose *DeCAP*, a method for debiasing LLMs using Context-Adaptive Prompt Generation. |
Suyoung Bae; YunSeok Choi; Jee-Hyong Lee; | naacl | 2025-05-04 |
| 838 | From Generating Answers to Building Explanations: Integrating Multi-Round RAG and Causal Modeling for Scientific QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by findings from the social sciences, we present an implemented causal QA approach that combines iterative RAG with guidance from a formal model of causation. |
VICTOR BARRES et. al. | naacl | 2025-05-04 |
| 839 | THREAD: Thinking Deeper with Recursive Spawning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Large language models (LLMs) have shown impressive capabilities across diverse settings, but still struggle as the length and complexity of the context increases. To address this challenge, we propose Thinking Recursively and Dynamically (ThReaD). |
Philip Schroeder; Nathaniel W. Morgan; Hongyin Luo; James R. Glass; | naacl | 2025-05-04 |
| 840 | MoEMoE: Question Guided Dense and Scalable Sparse Mixture-of-Expert for Multi-source Multi-modal Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we formulate a novel question-answer generation (QAG) framework in an environment containing multi-source, multimodal information. |
Vinay Kumar Verma; Shreyas Sunil Kulkarni; Happy Mittal; Deepak Gupta; | naacl | 2025-05-04 |
| 841 | MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Current benchmarks primarily focus on single-chart tasks, neglecting the multi-hop reasoning required to extract and integrate information from multiple charts, which is essential in practical applications. To fill this gap, we introduce MultiChartQA, a benchmark that evaluates MLLMs’ capabilities in four key areas: direct question answering, parallel question answering, comparative reasoning, and sequential reasoning. |
Zifeng Zhu; Mengzhao Jia; Zhihan Zhang; Lang Li; Meng Jiang; | naacl | 2025-05-04 |
| 842 | K-COMP: Retrieval-Augmented Medical Domain Question Answering With Knowledge-Injected Compressor Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save |
Jeonghun Cho; Gary Lee; | naacl | 2025-05-04 |
| 843 | Generating Complex Question Decompositions in The Face of Distribution Shifts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: One way of improving LLM training and fine-tuning is to leverage synthetic training data, but the superior performance of supervised approaches collapses in the face of distribution shifts, making them unsuitable for generating synthetic data across new domains and at scale. To address this, we propose an approach to generate synthetic decomposition data with only five annotated examples; we do this by (i) extending recent advancements in using LLM-as-judge and for reranking in novel ways, as well as (ii) using a panel of smaller-sized LLMs for data generation instead of resource-intensive larger models. |
Kelvin Han; Claire Gardent; | naacl | 2025-05-04 |
| 844 | SUNAR: Semantic Uncertainty Based Neighborhood Aware Retrieval for Complex QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce SUNAR, a novel approach that leverages LLMs to guide a Neighborhood Aware Retrieval process. |
Venktesh V; Mandeep Rathee; Avishek Anand; | naacl | 2025-05-04 |
| 845 | Hybrid Graphs for Table-and-Text Based Question Answering Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a novel Hybrid Graph-based approach for Table-Text QA that leverages LLMs without fine-tuning. |
Ankush Agarwal; Chaitanya Devaguptapu; Ganesh S; | naacl | 2025-05-04 |
| 846 | Interaction Configurations and Prompt Guidance in Conversational AI for Question Answering in Human-AI Teams Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Extending from our initial formative study, which revealed challenges in human utilization of conversational AI support, we designed two configurations for prompt guidance: a Nudging approach, where the AI suggests potential responses for human agents, and a Highlight strategy, emphasizing crucial parts of reference documents to aid human responses. |
JAEYOON SONG et. al. | arxiv-cs.HC | 2025-05-02 |
| 847 | Exp-VQA: Fine-grained Facial Expression Analysis Via Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yujian Yuan; Jiabei Zeng; Shiguang Shan; | Pattern Recognit. | 2025-05-01 |
| 848 | Retrieval Augmented Generation-driven Information Retrieval and Question Answering in Construction Management IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
CHENGKE WU et. al. | Adv. Eng. Informatics | 2025-05-01 |
| 849 | Augmenting General-purpose Large-language Models with Domain-specific Multimodal Knowledge Graph for Question-answering in Construction Project Management IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
SHENGHUA ZHOU et. al. | Adv. Eng. Informatics | 2025-05-01 |
| 850 | A Resilient Generative Model in Few-shot Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Anqi Zou; Yanping Chen; Ruizhang Huang; Yongbin Qin; | Knowl. Based Syst. | 2025-05-01 |
| 851 | Simulating Question-answering Correctness with A Conditional Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a method called Diffusion-based Simulator (DSim), which takes advantage of diffusion to alleviate the bias accumulation. |
Ting Long; Li’ang Yin; Yi Chang; Wei Xia; Yong Yu; | www | 2025-04-30 |
| 852 | ConSens: Assessing Context Grounding in Open-book Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing evaluation methods, primarily based on the LLM-as-a-judge approach, face significant limitations, including biases, scalability issues, and dependence on costly external systems. To address these challenges, we propose a novel metric that contrasts the perplexity of the model response under two conditions: when the context is provided and when it is not. |
Ivan Vankov; Matyo Ivanov; Adriana Correia; Victor Botev; | arxiv-cs.CL | 2025-04-30 |
| 853 | Talk Before You Retrieve: Agent-Led Discussions for Better RAG in Medical QA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, existing medical RAG systems suffer from two key limitations: (1) a lack of modeling for human-like reasoning behaviors during information retrieval, and (2) reliance on suboptimal medical corpora, which often results in the retrieval of irrelevant or noisy snippets. To overcome these challenges, we propose Discuss-RAG, a plug-and-play module designed to enhance the medical QA RAG system through collaborative agent-based reasoning. |
XUANZHAO DONG et. al. | arxiv-cs.CL | 2025-04-29 |
| 854 | Optimizing Answer Generator in Vietnamese Legal Question Answering Systems Using Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The development of large language models (LLMs) such as ChatGPT and Gemini has led to impressive advancements in question answering (QA) systems. However, they often rely on … |
Huong Le; Ngoc Luu; Thanh Nguyen; Tuan Dao; Sang Dinh; | ACM Transactions on Asian and Low-Resource Language … | 2025-04-29 |
| 855 | Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For domain-specific tasks, it is not clear if teacher or student model, or both, must be considered for domain adaptation. In this work, we study this problem from perspective of telecom domain Question-Answering (QA) task. |
Rishika Sen; Sujoy Roychowdhury; Sumit Soman; H. G. Ranjani; Srikhetra Mohanty; | arxiv-cs.CL | 2025-04-28 |
| 856 | VideoMultiAgents: A Multi-Agent Framework for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, many existing methods rely on feeding frame-level captions into a single model, making it difficult to adequately capture temporal and interactive contexts. To address this limitation, we introduce VideoMultiAgents, a framework that integrates specialized agents for vision, scene graph analysis, and text processing. |
NORIYUKI KUGO et. al. | arxiv-cs.CV | 2025-04-25 |
| 857 | FinBERT-QA: Financial Question Answering with Pre-trained BERT Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Motivated by the emerging demand in the financial industry for the automatic analysis of unstructured and structured data at scale, Question Answering (QA) systems can provide … |
Bithiah Yuan; | ArXiv | 2025-04-24 |
| 858 | A Comprehensive Survey of Knowledge-Based Vision Question Answering Systems: The Lifecycle of Knowledge in Visual Reasoning Task Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite substantial progress, no comprehensive survey currently exists that systematically organizes and reviews the existing KB-VQA methods. This survey aims to fill this gap by establishing a structured taxonomy of KB-VQA approaches, and categorizing the systems into main stages: knowledge representation, knowledge retrieval, and knowledge reasoning. |
Jiaqi Deng; Zonghan Wu; Huan Huo; Guandong Xu; | arxiv-cs.CV | 2025-04-24 |
| 859 | TraveLLaMA: Facilitating Multi-modal Large Language Models to Understand Urban Scenes and Provide Travel Assistance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present TraveLLaMA, a specialized multimodal language model designed for urban scene understanding and travel assistance. |
MENG CHU et. al. | arxiv-cs.CV | 2025-04-23 |
| 860 | Credible Plan-Driven RAG Method for Multi-Hop Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing RAG methods often suffer from deviations in reasoning paths and cumulative errors in intermediate steps, reducing the fidelity of the final answer. To address these limitations, we propose PAR-RAG (Plan-then-Act-and-Review RAG), a novel framework inspired by the PDCA (Plan-Do-Check-Act) cycle, to enhance both the accuracy and factual consistency in multi-hop question answering. Specifically, PAR-RAG selects exemplars matched by the semantic complexity of the current question to guide complexity-aware top-down planning, resulting in more precise and coherent multi-step reasoning trajectories. |
NINGNING ZHANG et. al. | arxiv-cs.CL | 2025-04-23 |
| 861 | FinDER: Financial Dataset for Question Answering and Evaluating Retrieval-Augmented Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By challenging models to retrieve relevant information from large corpora rather than relying on readily determined contexts, FinDER offers a more realistic benchmark for evaluating RAG systems. We further present a comprehensive evaluation of multiple state-of-the-art retrieval models and Large Language Models, showcasing challenges derived from a realistic benchmark to drive future research on truthful and precise RAG in the financial domain. |
CHANYEOL CHOI et. al. | arxiv-cs.IR | 2025-04-22 |
| 862 | Efficient Document Retrieval with G-Retriever Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose an enhanced approach that replaces the PCST method with an attention-based sub-graph construction technique, enabling more efficient and context-aware retrieval. |
Manthankumar Solanki; | arxiv-cs.LG | 2025-04-21 |
| 863 | LLM-KGMQA: Large Language Model-augmented Multi-hop Question-answering System Based on Knowledge Graph in Medical Field Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
FEILONG WANG et. al. | Knowl. Inf. Syst. | 2025-04-21 |
| 864 | A Hierarchical Framework for Measuring Scientific Paper Innovation Via Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose HSPIM, a hierarchical and training-free framework based on large language models (LLMs). |
Hongming Tan; Shaoxiong Zhan; Fengwei Jia; Hai-Tao Zheng; Wai Kin Chan; | arxiv-cs.CL | 2025-04-20 |
| 865 | Long-context Non-factoid Question Answering in Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This study explores context-shortening techniques, including Open Information Extraction (OIE), coreference resolution, Answer Paragraph Selection (APS), and their combinations, to improve QA performance. |
Ritwik Mishra; Rajiv Ratn Shah; Ponnurangam Kumaraguru; | arxiv-cs.CL | 2025-04-18 |
| 866 | WebGLM: Towards An Efficient and Reliable Web-Enhanced Question-Answering System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We present WebGLM, an enhanced Large Language Model (LLM)-based retrieval question-answering system based on the ChatGLM3-6B, offering significant improvements over previous … |
HANYU LAI et. al. | ACM Transactions on Information Systems | 2025-04-18 |
| 867 | LLM-as-a-Judge: Reassessing The Performance of LLMs in Extractive QA IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we reassess the performance of QA models using LLM-as-a-judge across four reading comprehension QA datasets. |
Xanh Ho; Jiahao Huang; Florian Boudin; Akiko Aizawa; | arxiv-cs.CL | 2025-04-16 |
| 868 | Mitigating LLM Hallucinations with Knowledge Graphs: A Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This research paper provides learning outcomes from a case study with LinkQ, an open-source natural language interface that was developed to combat hallucinations by forcing an LLM to query a knowledge graph (KG) for ground-truth data during question-answering (QA). |
Harry Li; Gabriel Appleby; Kenneth Alperin; Steven R Gomez; Ashley Suh; | arxiv-cs.HC | 2025-04-16 |
| 869 | MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces MQAD, a music QA dataset built on the Million Song Dataset (MSD), encompassing a rich array of musical features – including beat, chord, key, structure, instrument, and genre – across 270,000 tracks, featuring nearly 3 million diverse questions and captions. |
Z. OUYANG et. al. | icassp | 2025-04-15 |
| 870 | AskQE: Question Answering As Automatic Evaluation for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: How can a monolingual English speaker determine whether an automatic translation in French is good enough to be shared? Existing MT error detection and quality estimation (QE) techniques do not address this practical scenario. We introduce AskQE, a question generation and answering framework designed to detect critical MT errors and provide actionable feedback, helping users decide whether to accept or reject MT outputs even without the knowledge of the target language. |
Dayeon Ki; Kevin Duh; Marine Carpuat; | arxiv-cs.CL | 2025-04-15 |
| 871 | Constraint-Awareness and Graph Reasoning for Temporal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Thirdly, a gap exists between the vector spaces of KG embeddings and question representations, and a simplistic hard concatenation or fusion of these two can lead to suboptimal solutions. To address these shortcomings, we propose a temporally aware QA method named Constraint-Awareness and Graph Reasoning. |
Z. Sun; K. Zhang; X. Zhang; J. Liu; | icassp | 2025-04-15 |
| 872 | Exploring The Role of Knowledge Graph-Based RAG in Japanese Medical Question Answering with Small-Scale LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large language models (LLMs) perform well in medical QA, but their effectiveness in Japanese contexts is limited due to privacy constraints that prevent the use of commercial models like GPT-4 in clinical settings. |
YINGJIAN CHEN et. al. | arxiv-cs.CL | 2025-04-15 |
| 873 | Ai2 Scholar QA: Organized Literature Synthesis with Attribution IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce Ai2 Scholar QA, a free online scientific question answering application. |
AMANPREET SINGH et. al. | arxiv-cs.CL | 2025-04-15 |
| 874 | Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a text-to-speech (TTS) system powered by a fine-tuned Llama model, named TTS-Llama, that achieves state-of-the-art speech synthesis performance. |
M. Shen; | icassp | 2025-04-15 |
| 875 | A Hierarchical Reasoning Framework for Complex Question Answering Over Knowledge Graph with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To enhance model interpretability, reinforcement learning based methods are introduced. |
Z. Zhang; Z. Zhang; Y. Zhang; W. Zhao; | icassp | 2025-04-15 |
| 876 | Electrocardiogram Report Generation and Question Answering Via Retrieval-Augmented Self-Supervised Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Interpreting electrocardiograms (ECGs) and generating comprehensive reports remain challenging tasks in cardiology, often requiring specialized expertise and significant time investment. To address these critical issues, we propose ECG-ReGen, a retrieval-based approach for ECG-to-text report generation and question answering. |
J. Tang; T. Xia; Y. Lu; C. Mascolo; A. Saeed; | icassp | 2025-04-15 |
| 877 | Visual Entity-Centric Prompting for Knowledge Retrieval in Knowledge-based VQA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a visual entity-centric prompting for knowledge retrieval (VEPR) to bridge the gap between the implicit and explicit knowledge driven by visual entities via large language models. |
J. YANG et. al. | icassp | 2025-04-15 |
| 878 | Bridging Neural and Symbolic Reasoning: A Dual-System Framework for Interpretable Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, LLMs often lack transparency in their reasoning processes and struggle with hallucination. To overcome these challenges, we propose Dual-NeSy, a Dual-system framework that integrates Neural networks with Symbolic logic for interpretable question answering. |
J. Shi; X. Ding; H. Zhao; T. Liu; B. Qin; | icassp | 2025-04-15 |
| 879 | Leveraging Chain of Thought Towards Empathetic Spoken Dialogue Without Corresponding Question-Answering Data IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In response, we propose a novel approach that circumvents the need for question-answering data, called Listen, Perceive, and Express (LPE). |
J. Xie; | icassp | 2025-04-15 |
| 880 | Speech Retrieval-Augmented Generation Without Automatic Speech Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While this cascaded pipeline has proven effective in many practical settings, ASR errors can propagate to the retrieval and generation steps. To overcome this limitation, we introduce SpeechRAG, a novel framework designed for open-question answering over spoken data. |
D. J. MIN et. al. | icassp | 2025-04-15 |
| 881 | Audiopedia: Audio QA with Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce Audiopedia, a novel task called Audio Question Answering with Knowledge, which requires both audio comprehension and external knowledge reasoning. |
A. S. Penamakuri; K. Chhatre; A. Jain; | icassp | 2025-04-15 |
| 882 | A Continual Learning Approach for Embodied Question Answering with Generative Adversarial Imitation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we proposed a continual learning method based on generative adversarial imitation learning and self-supervision to support the agent when facing unseen environments. |
X. ZENG et. al. | icassp | 2025-04-15 |
| 883 | SiQA: A Large Multi-Modal Question Answering Model for Structured Images Based on RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose SiQA, a knowledge construction and Retrieval-Augmented Generation (RAG)-based multimodal Question-Answering model designed for Structured Images. |
J. Liu; Y. Tao; F. Wang; H. Li; X. Qin; | icassp | 2025-04-15 |
| 884 | Pre-training, Fine-tuning and Re-ranking: A Three-Stage Framework for Legal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a three-stage (pre-training, fine-tuning and re-ranking) framework for legal QA (called PFR-LQA), which promotes the fine-grained text representation learning and boosts the performance of dense retrieval with the dual-encoder architecture. |
S. Ni; H. Cheng; M. Yang; | icassp | 2025-04-15 |
| 885 | Composable NLP Workflows for BERT-based Ranking and QA System Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we built an end-to-end Ranking and Question-Answering (QA) system using Forte, a toolkit that makes composable NLP pipelines. |
Gaurav Kumar; Murali Mohana Krishna Dandu; | arxiv-cs.CL | 2025-04-12 |
| 886 | VLMT: Vision-Language Multimodal Transformer for Multimodal Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods for Multimodal Multi-hop Question Answering (MMQA) often suffer from limited reasoning capabilities, reliance on modality conversion, and inadequate alignment between visual and textual representations. To address these limitations, this paper introduces Vision-Language Multimodal Transformer (VLMT), a unified architecture that integrates a transformer-based vision encoder with a sequence-to-sequence language model. |
Qi Zhi Lim; Chin Poo Lee; Kian Ming Lim; Kalaiarasi Sonai Muthu Anbananthen; | arxiv-cs.CV | 2025-04-11 |
| 887 | Multimodal Commonsense Knowledge Distillation for Visual Question Answering (Student Abstract) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Existing Multimodal Large Language Models (MLLMs) and Visual Language Pretrained Models (VLPMs) have shown remarkable performances in general Visual Question Answering (VQA). … |
Shuo Yang; Siwen Luo; S. Han; | AAAI Conference on Artificial Intelligence | 2025-04-11 |
| 888 | Knowledge Graph-extended Retrieval Augmented Generation for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Ideally, an AI system should be both robust to missing facts as well as easy to communicate with. This paper proposes such a system that integrates LLMs and KGs without requiring training, ensuring adaptability across different KGs with minimal human effort. |
Jasper Linders; Jakub M. Tomczak; | arxiv-cs.LG | 2025-04-11 |
| 889 | TALE: A Tool-Augmented Framework for Reference-Free Evaluation of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Tool-Augmented LLM Evaluation (TALE), a framework to assess LLM outputs without predetermined ground-truth answers. |
Sher Badshah; Ali Emami; Hassan Sajjad; | arxiv-cs.CL | 2025-04-09 |
| 890 | Visual Question Answering: A Survey of Methods, Datasets, Evaluation, and Challenges IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Visual question answering (VQA) is a dynamic field of research that aims to generate textual answers from given visual and question information. It is a multimodal field that has … |
Byeong Su Kim; Jieun Kim; Deokwoo Lee; Beakcheol Jang; | ACM Computing Surveys | 2025-04-08 |
| 891 | Simplifying Data Integration: SLM-Driven Systems for Unified Semantic Queries Across Heterogeneous Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a novel Small Language Model (SLM)-driven system that synergizes advancements in lightweight Retrieval-Augmented Generation (RAG) and semantic-aware data structuring to enable efficient, accurate, and scalable query resolution across diverse data formats. |
Teng Lin; | arxiv-cs.DB | 2025-04-07 |
| 892 | REVEAL: Relation-based Video Representation Learning for Video-Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Video-Question-Answering (VideoQA) comprises the capturing of complex visual relation changes over time, remaining a challenge even for advanced Video Language Models (VLM), i.a., because of the need to represent the visual content to a reasonably sized input for those models. To address this problem, we propose RElation-based Video rEpresentAtion Learning (REVEAL), a framework designed to capture visual relation information by encoding them into structured, decomposed representations. |
Sofian Chaybouti; Walid Bousselham; Moritz Wolter; Hilde Kuehne; | arxiv-cs.CV | 2025-04-07 |
| 893 | Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering Via White-Box and Black-Box LLM Collaboration IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce Collab-RAG, a collaborative training framework that leverages mutual enhancement between a white-box small language model (SLM) and a black-box large language model (LLM) for RAG. |
RAN XU et. al. | arxiv-cs.CL | 2025-04-07 |
| 894 | Advancing Egocentric Video Question Answering with Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce QaEgo4Dv2 to mitigate annotation noise in QaEgo4D, enabling more reliable comparison. |
Alkesh Patel; Vibhav Chitalia; Yinfei Yang; | arxiv-cs.CV | 2025-04-06 |
| 895 | Enabling Collaborative Parametric Knowledge Calibration for Retrieval-Augmented Vision Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To fully exploit the cross-task synergy in KB-VQA, we propose a unified retrieval-augmented VQA framework with collaborative parametric knowledge calibration. |
JIAQI DENG et. al. | arxiv-cs.CV | 2025-04-05 |
| 896 | QIRL: Boosting Visual Question Answering Via Optimized Question-Image Relation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Second, they do not assess the relevance between the input question and image during inference, as no prior work has examined the degree of input relevance in debiasing studies. Motivated by these limitations, we propose a novel framework, Optimized Question-Image Relation Learning (QIRL), which employs a generation-based self-supervised learning strategy. |
QUANXING XU et. al. | arxiv-cs.CV | 2025-04-04 |
| 897 | Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: (2) Excessive reliance on implicit learning in Transformer-based cross-modal self-attention fusion methods, which obscures crucial local semantic correlations in medical scenarios. To address these issues, this study proposes a HiCA-VQA method, including two modules: Hierarchical Prompting for fine-grained medical questions and Hierarchical Answer Decoders. |
Junkai Zhang; Bin Li; Shoujun Zhou; Yue Du; | arxiv-cs.CV | 2025-04-03 |
| 898 | Single-Pass Document Scanning for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a single-pass document scanning approach that processes the entire text in linear time, preserving global coherence while deciding which sentences are most relevant to the query. |
WEILI CAO et. al. | arxiv-cs.CL | 2025-04-03 |
| 899 | Leveraging Static Relationships for Intra-Type and Inter-Type Message Passing in Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although methods based on static relationship reasoning have made certain progress, there are still deficiencies in the accuracy of static relationship recognition and representation, and they have not fully utilized the static relationship information in videos for in-depth reasoning and analysis. Therefore, this paper proposes a reasoning method for intra-type and inter-type message passing based on static relationships. |
Lili Liang; Guanglu Sun; | arxiv-cs.CV | 2025-04-03 |
| 900 | GeoRAG: A Question-Answering Approach from A Geographical Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study presents GeoRAG, a knowledge-enhanced QA framework integrating domain-specific fine-tuning and prompt engineering with Retrieval-Augmented Generation (RAG) technology to enhance geographic knowledge retrieval accuracy and user interaction. |
JIAN WANG et. al. | arxiv-cs.IR | 2025-04-02 |
| 901 | DSAF: A Dual-Stage Attention Based Multimodal Fusion Framework for Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
K. Mukesh; S. L. Jayaprakash; R. Prasanna Kumar; | SN Computer Science | 2025-04-01 |
| 902 | Weakly-supervised Explainable Question Answering Via Question Aware Contrastive Learning and Adaptive Gate Mechanism Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
YUE FAN et. al. | Inf. Sci. | 2025-04-01 |
| 903 | Chart Question Answering with Multimodal Graph Representation Learning and Zero-shot Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
A. Farahani; Peyman Adibi; Sayyed Mohammad Saeed Ehsani; Hans-Peter Hutter; Alireza Darvishy; | Expert Syst. Appl. | 2025-04-01 |
| 904 | Seeing and Reasoning: A Simple Deep Learning Approach to Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
R. Zakari; Jim Wilson Owusu; Ke Qin; Tao He; Guangchun Luo; | Big Data Min. Anal. | 2025-04-01 |
| 905 | Bayesian-error-informed Contrastive Learning for Knowledge-based Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Sudarshan Yerragunta; R. Prasath; G. Girish; | Comput. Electr. Eng. | 2025-04-01 |
| 906 | RBTM: A Hybrid Gradient Regression-Based Transformer Model for Biomedical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Suneetha Vazrala; Thayyaba Khatoon Mohammed; | Biomed. Signal Process. Control. | 2025-04-01 |
| 907 | Complex Knowledge Base Question Answering with Difficulty-aware Active Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
DONG WANG et. al. | Expert Syst. Appl. | 2025-04-01 |
| 908 | Chatbot Dialog Design for Improved Human Performance in Domain Knowledge Discovery Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The advent of machine learning (ML) has led to the widespread adoption of developing task-oriented dialog systems for scientific applications (e.g., science gateways) where … |
ROLAND ORUCHE et. al. | IEEE Transactions on Human-Machine Systems | 2025-04-01 |
| 909 | Biomedical Question Answering Via Multi-Level Summarization on A Local Knowledge Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel method that utilizes propositional claims to construct a local knowledge graph from retrieved documents. |
Lingxiao Guan; Yuanhao Huang; Jie Liu; | arxiv-cs.CL | 2025-04-01 |
| 910 | Are You Really Listening? Boosting Perceptual Awareness in Music-QA Benchmarks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These findings suggest existing benchmarks predominantly assess reasoning abilities rather than audio perception. To overcome this challenge, we present RUListening: Robust Understanding through Listening, a framework that enhances perceptual evaluation in Music-QA benchmarks. |
Yongyi Zang; Sean O’Brien; Taylor Berg-Kirkpatrick; Julian McAuley; Zachary Novack; | arxiv-cs.SD | 2025-03-31 |
| 911 | Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Moreover, in MCQA tasks, the absence of relevant KG knowledge for certain answer options remains a significant challenge. To address these issues, we propose Question-Aware Knowledge Graph Prompting (QAP), which incorporates question embeddings into GNN aggregation to dynamically assess KG relevance. |
Haochen Liu; Song Wang; Chen Chen; Jundong Li; | arxiv-cs.CL | 2025-03-30 |
| 912 | Memory-Aware and Uncertainty-Guided Retrieval for Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While Retrieval-Augmented Generation (RAG) has made progress in this area, existing methods often suffer from two key limitations: (1) fixed or overly frequent retrieval steps, and (2) ineffective use of previously retrieved knowledge. We propose MIND (Memory-Informed and INteractive Dynamic RAG), a framework that addresses these challenges through: (i) prompt-based entity extraction to identify reasoning-relevant elements, (ii) dynamic retrieval triggering based on token-level entropy and attention signals, and (iii) memory-aware filtering, which stores high-confidence facts across reasoning steps to enable consistent multi-hop generation. |
Yuelyu Ji; Rui Meng; Zhuochun Li; Daqing He; | arxiv-cs.CL | 2025-03-29 |
| 913 | FReM: A Flexible Reasoning Mechanism for Balancing Quick and Slow Thinking in Long-Context Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Quick thinking usually relies on pattern matching rather than truly understanding the query logic, which misses proper understanding. To address these issues, we propose FReM: Flexible Reasoning Mechanism, a method that adjusts reasoning depth according to the complexity of each question. |
ZHENGYI ZHAO et. al. | arxiv-cs.CL | 2025-03-29 |
| 914 | Can DeepSeek Reason Like A Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we investigate the dialogue capabilities of the DeepSeek model in robotic surgery scenarios, focusing on tasks such as Single Phrase QA, Visual QA, and Detailed Description. |
BOYI MA et. al. | arxiv-cs.CV | 2025-03-29 |
| 915 | Preference-based Learning with Retrieval Augmented Generation for Conversational Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This work presents PRAISE, a pipeline-based approach for ConvQA that trains LLM adapters for each of the three subtasks. |
Magdalena Kaiser; Gerhard Weikum; | arxiv-cs.CL | 2025-03-28 |
| 916 | AskSport: Web Application for Sports Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces AskSport, a question-answering web application about sports. |
Enzo B Onofre; Leonardo M P Moraes; Cristina D Aguiar; | arxiv-cs.AI | 2025-03-26 |
| 917 | DomainCQA: Crafting Expert-Level QA from Domain-Specific Charts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Chart Question Answering (CQA) benchmarks are essential for evaluating the capability of Multimodal Large Language Models (MLLMs) to interpret visual data. However, current … |
LING ZHONG et. al. | ArXiv | 2025-03-25 |
| 918 | DomainCQA: Crafting Knowledge-Intensive QA from Domain-Specific Charts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Chart Question Answering (CQA) evaluates Multimodal Large Language Models (MLLMs) on visual understanding and reasoning over chart data. |
YUJING LU et. al. | arxiv-cs.CL | 2025-03-25 |
| 919 | VGAT: A Cancer Survival Analysis Framework Transitioning from Generative Visual Question Answering to Genomic Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To enable survival prediction using only whole-slide images (WSI), we propose the Visual-Genomic Answering-Guided Transformer (VGAT), a framework integrating Visual Question Answering (VQA) techniques for genomic modality reconstruction. By adapting VQA’s text feature extraction approach, we derive stable genomic representations that circumvent dimensionality challenges in raw genomic data. Simultaneously, a cluster-based visual prompt module selectively enhances discriminative WSI patches, addressing noise from unfiltered image regions. Evaluated across five TCGA datasets, VGAT outperforms existing WSI-only methods, demonstrating the viability of genomic-informed inference without sequencing. |
ZIZHI CHEN et. al. | arxiv-cs.CV | 2025-03-25 |
| 920 | Improved Alignment of Modalities in Large Vision Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose four training stages for aligning the vision model with the language model, in other words, the language model is given an ability to process visual inputs. |
Kartik Jangra; Aman Kumar Singh; Yashwani Mann; Geetanjali Rathee; | arxiv-cs.CV | 2025-03-25 |
| 921 | A Survey of Large Language Model Agents for Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper surveys the development of large language model (LLM)-based agents for question answering (QA). |
Murong Yue; | arxiv-cs.CL | 2025-03-24 |
| 922 | MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Visual Question Answering (VQA) requires reasoning across visual and textual modalities, yet Large Vision-Language Models (LVLMs) often lack integrated commonsense knowledge, limiting their robustness in real-world scenarios. To address this, we introduce MAGIC-VQA, a novel framework that enhances VQA by systematically integrating commonsense knowledge with LVLMs. |
Shuo Yang; Siwen Luo; Soyeon Caren Han; Eduard Hovy; | arxiv-cs.CL | 2025-03-24 |
| 923 | SUNAR: Semantic Uncertainty Based Neighborhood Aware Retrieval for Complex QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce SUNAR, a novel approach that leverages LLMs to guide a Neighborhood Aware Retrieval process. |
V Venktesh; Mandeep Rathee; Avishek Anand; | arxiv-cs.IR | 2025-03-23 |
| 924 | Conversational Open-domain Question Answering for Resource-constrained Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Emrah Budur; Tunga Güngör; | Turkish J. Electr. Eng. Comput. Sci. | 2025-03-21 |
| 925 | Joint Extraction Matters: Prompt-Based Visual Question Answering for Multi-Field Document Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper investigates the merits of extracting multiple fields jointly versus separately. |
Mengsay Loem; Taiju Hosaka; | arxiv-cs.CL | 2025-03-21 |
| 926 | MTBench: A Multimodal Time Series Benchmark for Temporal Reasoning and Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: While multimodal learning has gained traction, existing multimodal time-series datasets fall short in evaluating cross-modal reasoning and complex question answering, which are essential for capturing complex interactions between narrative information and temporal patterns. To bridge this gap, we introduce Multimodal Time Series Benchmark (MTBench), a large-scale benchmark designed to evaluate large language models (LLMs) on time series and text understanding across financial and weather domains. |
JIALIN CHEN et. al. | arxiv-cs.CL | 2025-03-21 |
| 927 | Agentic Keyframe Search for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To address it, we propose Agentic Keyframe Search (AKeyS), a simple yet powerful algorithm for identifying keyframes in the VideoQA task. |
Sunqi Fan; Meng-Hao Guo; Shuojin Yang; | arxiv-cs.CV | 2025-03-20 |
| 928 | MKG-Rank: Enhancing Large Language Models with Knowledge Graph for Multilingual Medical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large Language Models (LLMs) have shown remarkable progress in medical question answering (QA), yet their effectiveness remains predominantly limited to English due to imbalanced multilingual training data and scarce medical resources for low-resource languages. To address this critical language gap in medical QA, we propose Multilingual Knowledge Graph-based Retrieval Ranking (MKG-Rank), a knowledge graph-enhanced framework that enables English-centric LLMs to perform multilingual medical QA. |
FEIYANG LI et. al. | arxiv-cs.CL | 2025-03-20 |
| 929 | Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce PKR-QA (Procedural Knowledge Reasoning Question Answering), a new benchmark for question answering over procedural tasks that require structured reasoning. |
THANH-SON NGUYEN et. al. | arxiv-cs.CV | 2025-03-19 |
| 930 | Bias Evaluation and Mitigation in Retrieval-Augmented Medical Question-Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study systematically evaluates demographic biases within medical RAG pipelines across multiple QA benchmarks, including MedQA, MedMCQA, MMLU, and EquityMedQA. |
Yuelyu Ji; Hang Zhang; Yanshan Wang; | arxiv-cs.CL | 2025-03-19 |
| 931 | Right Answer, Wrong Score: Uncovering The Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we shed light on the inconsistencies of MCQA evaluation strategies, which can lead to inaccurate and misleading model comparisons. |
FRANCESCO MARIA MOLFESE et. al. | arxiv-cs.CL | 2025-03-19 |
| 932 | Synthetic Clarification and Correction Dialogues About Data-Centric Tasks – A Teacher-Student Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Real dialogues with AI assistants for solving data-centric tasks often follow dynamic, unpredictable paths due to imperfect information provided by the user or in the data, which … |
Christian Poelitz; Nick McKenna; | ArXiv | 2025-03-18 |
| 933 | Synthetic Clarification and Correction Dialogues About Data-Centric Tasks — A Teacher-Student Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we develop a novel framework for synthetically generating controlled, multi-turn conversations between a user and AI assistant for the task of table-based question answering, which can be generated from an existing dataset with fully specified table QA examples for any target domain. |
Christian Poelitz; Nick McKenna; | arxiv-cs.CL | 2025-03-18 |
| 934 | EIAD: Explainable Industrial Anomaly Detection Via Multi-Modal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Furthermore, the application of large multi-modal models in IAD remains in its infancy, facing challenges in balancing question-answering (QA) performance and mask-based grounding capabilities, often owing to overfitting during the fine-tuning process. To address these challenges, we propose a novel approach that introduces a dedicated multi-modal defect localization module to decouple the dialog functionality from the core feature extraction. |
Zongyun Zhang; Jiacheng Ruan; Xian Gao; Ting Liu; Yuzhuo Fu; | arxiv-cs.AI | 2025-03-18 |
| 935 | Elevating Visual Question Answering Through Implicitly Learned Reasoning Pathways in LVLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Large Vision-Language Models (LVLMs) have shown remarkable progress in various multimodal tasks, yet they often struggle with complex visual reasoning that requires multi-step inference. To address this limitation, we propose MF-SQ-LLaVA, a novel approach that enhances LVLMs by enabling implicit self-questioning through end-to-end training. |
Liu Jing; Amirul Rahman; | arxiv-cs.CV | 2025-03-18 |
| 936 | VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce VideoMind, a novel video-language agent designed for temporal-grounded video understanding. |
Ye Liu; Kevin Qinghong Lin; Chang Wen Chen; Mike Zheng Shou; | arxiv-cs.CV | 2025-03-17 |
| 937 | Chain-of-Action: Faithful and Multimodal Question Answering Through Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present a Chain-of-Action (CoA) framework for multimodal and retrieval-augmented Question-Answering (QA). |
Zhenyu Pan; Haozheng Luo; Manling Li; Han Liu; | iclr | 2025-03-17 |
| 938 | Generalization V.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To effectively capture task-specific pretraining data frequency, we propose a novel task-gram language model, which is built by counting the co-occurrence of semantically related $n$-gram pairs from task inputs and outputs in the pretraining corpus. |
XINYI WANG et. al. | iclr | 2025-03-17 |
| 939 | Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Towards a solution, we introduce MIRAGE (Multi-Image Retrieval Augmented Generation), an open-source, lightweight visual-RAG framework that processes up to 10k images on a single 40G A100 GPU—far surpassing the 1k-image limit of contemporary models. |
TSUNG-HAN WU et. al. | iclr | 2025-03-17 |
| 940 | ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As a result, we present *ClimaQA-Gold*, an expert-annotated benchmark dataset alongside *ClimaQA-Silver*, a large-scale, comprehensive synthetic QA dataset for climate science. |
VEERAMAKALI VIGNESH MANIVANNAN et. al. | iclr | 2025-03-17 |
| 941 | CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Due to the limitations of MCQ evaluations and the advanced reasoning abilities of MLLMs, models can often answer correctly by combining short video insights with elimination, without truly understanding the content. To bridge this gap, we introduce CG-Bench, a benchmark for clue-grounded question answering in long videos. |
GUO CHEN et. al. | iclr | 2025-03-17 |
| 942 | SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current benchmarks for video understanding typically emphasize isolated single-instance text inputs and fail to evaluate the capacity to sustain temporal reasoning throughout the entire duration of video streams. To address these limitations, we introduce SVBench, a pioneering benchmark with temporal multi-turn question-answering chains specifically designed to thoroughly assess the capabilities of streaming video understanding of current LVLMs. |
ZHENYU YANG et. al. | iclr | 2025-03-17 |
| 943 | Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These challenges often arise due to the complexity and ambiguity present in longer texts. To enhance the performance of LLMs in such scenarios, we introduce the Long Question Coreference Adaptation (LQCA) method. |
YANMING LIU et. al. | iclr | 2025-03-17 |
| 944 | MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we study whether MLLMs can perceive small visual details as effectively as large ones when answering questions about images. |
Jiarui Zhang; Mahyar Khayatkhoei; Prateek Chhikara; Filip Ilievski; | iclr | 2025-03-17 |
| 945 | Streaming Video Question-Answering with In-context Video KV-Cache Retrieval IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose ReKV, a novel training-free approach that enables efficient streaming video question-answering (StreamingVQA), by seamlessly integrating with existing Video Large Language Models (Video-LLMs). |
SHANGZHE DI et. al. | iclr | 2025-03-17 |
| 946 | CofCA: A STEP-WISE Counterfactual Multi-hop QA Benchmark IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, current factual Multi-hop QA (MHQA) benchmarks are annotated on open-source corpora such as Wikipedia, although useful for multi-step reasoning evaluation, they show limitations due to the potential data contamination in LLMs’ pre-training stage. To address these issues, we introduce the Step-wise and Counterfactual benchmark (CofCA), a novel evaluation benchmark consisting of factual data and counterfactual data that reveals LLMs’ real reasoning abilities on multi-step reasoning and reasoning chain evaluation. |
Jian Wu; Linyi Yang; Zhen Wang; Manabu Okumura; Yue Zhang; | iclr | 2025-03-17 |
| 947 | Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present a new multi-shot video understanding benchmark with detailed shot-level captions, comprehensive video summaries and question-answering pairs. |
Mingfei Han; Linjie Yang; Xiaojun Chang; Lina Yao; Heng Wang; | iclr | 2025-03-17 |
| 948 | QA-Calibration of Language Model Confidence Scores Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We argue, however, that this standard (average-case) notion of calibration is difficult to interpret for decision-making in generative QA. To address this, we generalize the standard notion of average calibration and introduce QA-calibration, which ensures calibration holds across different question-and-answer groups. |
Putra Manggala; Atalanti A. Mastakouri; Elke Kirschbaum; Shiva Kasiviswanathan; Aaditya Ramdas; | iclr | 2025-03-17 |
| 949 | GFSNet: Gaussian Fourier with Sparse Attention Network for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Xiang Shen; Dezhi Han; C. Chang; Ammar Oad; Huafeng Wu; | Artif. Intell. Rev. | 2025-03-15 |
| 950 | Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This typically occurs because these models tend to prioritize self-generated content over the input context, causing them to disregard pertinent details. To address this challenge, we introduce a novel method called Guided Attention Map Editing (GAME), which dynamically adjusts attention maps to improve contextual relevance. |
YU WANG et. al. | arxiv-cs.CL | 2025-03-11 |
| 951 | ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces ReAgent: a Reversible multi-Agent collaborative framework augmented with explicit backtracking mechanisms, enabling reversible multi-hop reasoning. |
XINJIE ZHAO et. al. | arxiv-cs.AI | 2025-03-10 |
| 952 | MapQA: Open-domain Geospatial Question Answering on Map Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A major challenge in scaling geospatial QA datasets for reasoning lies in the complexity of geospatial relationships, which require integrating spatial structures, topological dependencies, and multi-hop reasoning capabilities that most text-based QA datasets lack. To address these limitations, we introduce MapQA, a novel dataset that not only provides question-answer pairs but also includes the geometries of geo-entities referenced in the questions. |
Zekun Li; Malcolm Grossman; Mihir Kulkarni; Muhao Chen; Yao-Yi Chiang; | arxiv-cs.CL | 2025-03-10 |
| 953 | Talking to GDELT Through Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work we study various Retrieval Augmented Regeneration (RAG) approaches to gain an understanding of the strengths and weaknesses of each approach in a question-answering analysis. |
AUDUN MYERS et. al. | arxiv-cs.IR | 2025-03-10 |
| 954 | VisualSimpleQA: A Benchmark for Decoupled Evaluation of Large Vision-Language Models in Fact-Seeking Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current multimodal fact-seeking benchmarks primarily focus on comparing model outputs to ground truth answers, providing limited insights into the performance of modality-specific modules. To bridge this gap, we introduce VisualSimpleQA, a multimodal fact-seeking benchmark with two key features. |
YANLING WANG et. al. | arxiv-cs.CL | 2025-03-09 |
| 955 | HCT-QA: A Benchmark for Question Answering on Human-Centric Tables Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Tabular data embedded within PDF files, web pages, and other document formats are prevalent across numerous sectors such as government, engineering, science, and business. These … |
M. S. AHMAD et. al. | ArXiv | 2025-03-09 |
| 956 | Towards Fine-Grained Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing datasets exhibit gaps in temporal and spatial granularity, which consequently limits the capabilities of existing VideoQA methods. This paper introduces the Multi-Object Multi-Actor Question Answering (MOMA-QA) dataset, which is designed to address these shortcomings by emphasizing temporal localization, spatial relationship reasoning, and entity-centric queries. |
WEI DAI et. al. | arxiv-cs.CV | 2025-03-09 |
| 957 | Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on The Enhanced Conformal Prediction Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes an enhanced CP framework for medical multiple-choice question-answering (MCQA) tasks. |
Yusong Ke; Hongru Lin; Yuting Ruan; Junya Tang; Li Li; | arxiv-cs.CL | 2025-03-07 |
| 958 | BPQA Dataset: Evaluating How Well Language Models Leverage Blood Pressures to Answer Biomedical Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: It is an important component of biomedical data, which can be used to train transformer-based language models (LMs) for improving healthcare delivery. |
CHI HANG et. al. | arxiv-cs.CL | 2025-03-06 |
| 959 | Evaluating Answer Reranking Strategies in Time-sensitive Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate the impact of temporal characteristics of answers in Question Answering (QA) by exploring several simple answer selection techniques. |
Mehmet Kardan; Bhawna Piryani; Adam Jatowt; | arxiv-cs.CL | 2025-03-06 |
| 960 | Question-Aware Gaussian Experts for Audio-Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper proposes QA-TIGER, a novel framework that explicitly incorporates question information and models continuous temporal dynamics. |
HONGYEOB KIM et. al. | arxiv-cs.CV | 2025-03-06 |
| 961 | EgoLife: Towards Egocentric Life Assistant Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: By releasing our datasets, models, and benchmarks, we aim to stimulate further research in egocentric AI assistants. |
JINGKANG YANG et. al. | arxiv-cs.CV | 2025-03-05 |
| 962 | EchoQA: A Large Collection of Instruction Tuning Data for Echocardiogram Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a novel question-answering (QA) dataset using echocardiogram reports sourced from the Medical Information Mart for Intensive Care database. |
Lama Moukheiber; Mira Moukheiber; Dana Moukheiiber; Jae-Woo Ju; Hyung-Chul Lee; | arxiv-cs.AI | 2025-03-04 |
| 963 | Optimizing Open-domain Question Answering with Graph-based Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we benchmark various graph-based retrieval-augmented generation (RAG) systems across a broad spectrum of query types, including OLTP-style (fact-based) and OLAP-style (thematic) queries, to address the complex demands of open-domain question answering (QA). |
JOYCE CAHOON et. al. | arxiv-cs.IR | 2025-03-04 |
| 964 | Towards Robust Expert Finding in Community Question Answering Platforms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces TUEF, a topic-oriented user-interaction model for fair Expert Finding in Community Question Answering (CQA) platforms. |
Maddalena Amendola; Andrea Passarella; Raffaele Perego; | arxiv-cs.IR | 2025-03-04 |
| 965 | OWLViz: An Open-World Benchmark for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a challenging benchmark for the Open WorLd VISual question answering (OWLViz) task. |
THUY NGUYEN et. al. | arxiv-cs.LG | 2025-03-04 |
| 966 | LLMs Performance in Answering Educational Questions in Brazilian Portuguese: A Preliminary Analysis on LLMs Potential to Support Diverse Educational Needs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Question-answering systems facilitate adaptive learning and respond to student queries, making education more responsive. Despite that, challenges such as natural language … |
LUIZ RODRIGUES et. al. | Proceedings of the 15th International Learning Analytics … | 2025-03-03 |
| 967 | Q-NL Verifier: Leveraging Synthetic Data for Robust Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Question answering (QA) requires accurately aligning user questions with structured queries, a process often limited by the scarcity of high-quality query-natural language (Q-NL) pairs. To overcome this, we present Q-NL Verifier, an approach to generating high-quality synthetic pairs of queries and NL translations. |
Tim Schwabe; Louisa Siebel; Patrik Valach; Maribel Acosta; | arxiv-cs.CL | 2025-03-03 |
| 968 | Beyond Prompting: An Efficient Embedding Framework for Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose EmbQA, an embedding-level framework that alleviates these shortcomings by enhancing both the retriever and the reader. |
ZHANGHAO HU et. al. | arxiv-cs.CL | 2025-03-03 |
| 969 | SRAG: Structured Retrieval-Augmented Generation for Multi-Entity Question Answering Over Wikipedia Graph Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Multi-entity question answering (MEQA) poses significant challenges for large language models (LLMs), which often struggle to consolidate scattered information across multiple … |
Teng Lin; Yizhang Zhu; Yuyu Luo; Nan Tang; | arxiv-cs.CL | 2025-03-03 |
| 970 | CooKie: Commonsense Knowledge-guided Mixture-of-experts Framework for Fine-grained Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Chao Wang; Jianming Yang; Yang Zhou; Xiaodong Yue; | Inf. Sci. | 2025-03-01 |
| 971 | Bias-guided Margin Loss for Robust Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
YANHAN SUN et. al. | Inf. Process. Manag. | 2025-03-01 |
| 972 | Handling Language Prior and Compositional Reasoning Issues in Visual Question Answering System IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Souvik Chowdhury; Badal Soni; | Neurocomputing | 2025-03-01 |
| 973 | Prompt Learning for Few-Shot Question Answering Via Self-Context Data Augmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Pretrained language models (PLMs) have shown remarkable performance on question answering (QA) tasks, but they usually require fine-tuning (FT) that depends on a substantial … |
Jian-Qiang Qiu; Chun-Yang Zhang; C. L. P. Chen; | IEEE Transactions on Artificial Intelligence | 2025-03-01 |
| 974 | Streaming Video Question-Answering with In-context Video KV-Cache Retrieval IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose ReKV, a novel training-free approach that enables efficient streaming video question-answering (StreamingVQA), by seamlessly integrating with existing Video Large Language Models (Video-LLMs). |
SHANGZHE DI et. al. | arxiv-cs.CV | 2025-03-01 |
| 975 | AILS-NTUA at SemEval-2025 Task 8: Language-to-Code Prompting and Error Fixing for Tabular Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present our submission to SemEval-2025 Task 8: Question Answering over Tabular Data. |
Andreas Evangelatos; Giorgos Filandrianos; Maria Lymperaiou; Athanasios Voulodimos; Giorgos Stamou; | arxiv-cs.CL | 2025-03-01 |
| 976 | Collaborative Aware Bidirectional Semantic Reasoning for Video Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Video question answering (VideoQA) is the challenging task of accurately responding to natural language questions based on a given video. Most previous methods focus on designing … |
Xize Wu; Jiasong Wu; Lei Zhu; L. Senhadji; Huazhong Shu; | IEEE Transactions on Circuits and Systems for Video … | 2025-03-01 |
| 977 | Fine-grained Knowledge Fusion for Retrieval-augmented Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
XIAO LIANG et. al. | Inf. Fusion | 2025-03-01 |
| 978 | Cycle-VQA: A Cycle-Consistent Framework for Robust Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
LIN FAN et. al. | Pattern Recognit. | 2025-03-01 |
| 979 | PASemiQA: Plan-Assisted Agent for Question Answering on Semi-Structured Data with Text and Relational Information Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing RAG methods typically focus on a single type of external data, such as vectorized text database or knowledge graphs, and cannot well handle real-world questions on semi-structured data containing both text and relational information. To bridge this gap, we introduce PASemiQA, a novel approach that jointly leverages text and relational information in semi-structured data to answer questions. |
Hansi Yang; Qi Zhang; Wei Jiang; Jianguo Li; | arxiv-cs.CL | 2025-02-28 |
| 980 | Glimpse of MCQ Based VQA in Road & Traffic Scenarios Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Multi-modal models have been a boon to the community over the last decade. The Large Language and Vision Models for Autonomous Driving (LLVM-AD) challenge [3] is a Visual … |
Athira Krishnan R; Sumukha Bg; Ambarish Parthasarathy; | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-28 |
| 981 | Evaluating Multimodal Vision-Language Model Prompting Strategies for Visual Question Answering in Road Scene Understanding IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Understanding complex traffic scenes is a crucial challenge in advancing autonomous driving systems. Visual Question Answering (VQA) tasks have emerged as a promising approach to … |
Aryan Keskar; Srinivasa Perisetla; Ross Greer; | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-28 |
| 982 | WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present WebFAQ, a large-scale collection of open-domain question answering datasets derived from FAQ-style schema.org annotations. |
Michael Dinzinger; Laura Caspari; Kanishka Ghosh Dastidar; Jelena Mitrović; Michael Granitzer; | arxiv-cs.CL | 2025-02-28 |
| 983 | Bisecting K-Means in RAG for Enhancing Question-Answering Tasks Performance in Telecommunications Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work presents a novel Retrieval-Augmented Generation framework explicitly designed for the telecommunication domain, focusing on datasets composed of 3GPP documents. |
Pedro Sousa; Cláudio Klautau Mello; Frank B. Morte; Luis F. Solis Navarro; | arxiv-cs.IR | 2025-02-27 |
| 984 | Exploring Rewriting Approaches for Different Conversational Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we systematically investigate two different approaches, denoted as rewriting and fusion, on two fundamentally different generation tasks, including a text-to-text generation task and a multimodal generative task that takes as input text and generates a visualization or data table that answers the user’s question. |
MD MEHRAB TANJIM et. al. | arxiv-cs.CL | 2025-02-26 |
| 985 | Winning Big with Small Models: Knowledge Distillation Vs. Self-Training for Reducing Hallucination in Product QA Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The deployment of Large Language Models (LLMs) in customer support is constrained by hallucination (generating false information) and the high cost of proprietary models. To address these challenges, we propose a retrieval-augmented question-answering (QA) pipeline and explore how to balance human input and automation. |
ASHLEY LEWIS et. al. | arxiv-cs.CL | 2025-02-26 |
| 986 | Few-Shot Multilingual Open-Domain QA from 5 Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce a few-shot learning approach to synthesise large-scale multilingual data from large language models (LLMs). |
Fan Jiang; Tom Drummond; Trevor Cohn; | arxiv-cs.CL | 2025-02-26 |
| 987 | TRH2TQA: Table Recognition with Hierarchical Relationships to Table Question-Answering on Business Table Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite advancements in visual question answering, challenges persist with documents like financial reports, often structured in complicated tabular structures with complex … |
Pongsakorn Jirachanchaisiri; Nam Tuan Ly; Atsuhiro Takasu; | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-26 |
| 988 | AdQuestA: Knowledge-Guided Visual Question Answer Framework for Advertisements Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the rapidly evolving landscape of digital marketing, effective customer engagement through advertisements is crucial for brands. Thus, computational understanding of ads is … |
NEHA CHOUDHARY et. al. | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-26 |
| 989 | Winning Big with Small Models: Knowledge Distillation Vs. Self-Training for Reducing Hallucination in QA Agents Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The deployment of Large Language Models (LLMs) in customer support is constrained by hallucination (generating false information) and the high cost of proprietary models. To address … |
ASHLEY LEWIS et. al. | ArXiv | 2025-02-26 |
| 990 | Towards A Multimodal Large Language Model with Pixel-Level Insight for Biomedicine IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce a novel end-to-end multimodal large language model for the biomedical domain, named MedPLIB, which possesses pixel-level understanding. |
XIAOSHUANG HUANG et. al. | aaai | 2025-02-25 |
| 991 | Patch-level Sounding Object Tracking for Audio-Visual Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a new Patch-level Sounding Object Tracking (PSOT) method. |
ZHANGBIN LI et. al. | aaai | 2025-02-25 |
| 992 | EPERM: An Evidence Path Enhanced Reasoning Model for Knowledge Graph Question and Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, this paper reformulates the KGQA problem as a graphical model and proposes a three-stage framework named the Evidence Path Enhanced Reasoning Model (EPERM) for KGQA. |
Xiao Long; Liansheng Zhuang; Aodi Li; MingHong Yao; Shafei Wang; | aaai | 2025-02-25 |
| 993 | RETQA: A Large-Scale Open-Domain Tabular Question Answering Dataset for Real Estate Sector Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Compared with existing tabular question answering datasets, RETQA poses greater challenges due to three key factors: long-table structures, open-domain retrieval, and multi-domain queries. To tackle these challenges, we propose the SLUTQA framework, which integrates large language models with spoken language understanding tasks to enhance retrieval and answering accuracy. |
Zhensheng Wang; Wenmian Yang; Kun Zhou; Yiquan Zhang; Weijia Jia; | aaai | 2025-02-25 |
| 994 | Putting People in LLMs’ Shoes: Generating Better Answers Via Question Rewriter Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, their effectiveness in QA is often undermined by the vagueness of user questions. To address this issue, we introduce single-round instance-level prompt optimization, referred to as question rewriter. |
Junhao Chen; Bowen Wang; Zhouqiang Jiang; Yuta Nakashima; | aaai | 2025-02-25 |
| 995 | COLUMBUS: Evaluating COgnitive Lateral Understanding Through Multiple-Choice ReBUSes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Effective problem-solving also necessitates lateral thinking, which remains understudied in AI and has not been used to test visual perception systems. To bridge this gap, we formulate visual lateral thinking as a multiple-choice question-answering task and describe a three-step taxonomy-driven methodology for instantiating task examples. |
Koen Kraaijveld; Yifan Jiang; Kaixin Ma; Filip Ilievski; | aaai | 2025-02-25 |
| 996 | Audio-Visual Adaptive Fusion Network for Question Answering Based on Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Secondly, the fusion of audio-visual information is often weighted inadequately, limiting model performance. To address the above issues, we design the Audio-Visual Adaptive Fusion Network (AVAF-Net), which uses contrastive learning to align audio-visual information temporally and spatially and adaptively adjusts fusion weights based on the question. |
Xujian Zhao; Yixin Wang; Peiquan Jin; | aaai | 2025-02-25 |
| 997 | Citations and Trust in LLM Generated Responses IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We explored trust through an anti-monitoring framework, where trust is predicted to be correlated with presence of citations and inversely related to checking citations. |
YIFAN DING et. al. | aaai | 2025-02-25 |
| 998 | Towards Robust Visual Question Answering Via Prompt-Driven Geometric Harmonization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, current VQA models still struggle with the challenges of minority class collapse and spurious semantic correlations posed by language bias and imbalanced distributions. To address these challenges, this paper proposes a novel Prompt-Driven Geometric Harmonization (PDGH) paradigm, which integrates both geometric structure and information entropy principles to enhance the ability of VQA models to generalize effectively across diverse scenarios. |
YISHU LIU et. al. | aaai | 2025-02-25 |
| 999 | Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we focus on the problem of image hallucination, where images created by TTI models fail to faithfully depict factual content. |
Youngsun Lim; Hojun Choi; Hyunjung Shim; | aaai | 2025-02-25 |
| 1000 | When Open-Vocabulary Visual Question Answering Meets Causal Adapter: Benchmark and Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing VQA benchmarks predominantly adhere to a closed-set paradigm, limiting their ability to address arbitrary, unseen answers, and thus falling short in real-world scenarios. To address this limitation, we introduce the Open-Vocabulary Visual Question Answering (OVVQA) benchmark, specifically designed to evaluate models under open-world conditions by assessing their performance on both base classes (seen, common answers) and novel classes (unseen, rare answers). |
Feifei Zhang; Zhaoyi Zhang; Xi Zhang; Changsheng Xu; | aaai | 2025-02-25 |