Paper Digest: Recent Papers on Question Answering
Paper Digest Team extracted all recent Question Answering related papers on our radar, and generated highlight sentences for them. The results are then sorted by relevance & date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has more coverage and is updated in real time to include the most recent updates on this topic.
This curated list is created by the Paper Digest Team. Experience the cutting-edge capabilities of Paper Digest, an innovative AI-powered research platform that gets you the personalized and comprehensive daily paper digests on the latest research in your field. It also empowers you to read articles, write articles, get answers, conduct literature reviews and generate research reports.
Experience the full potential of our services today!
TABLE 1: Paper Digest: Recent Papers on Question Answering
| Paper | Author(s) | Source | Date | |
|---|---|---|---|---|
| 1 | The Illusion of Certainty: Uncertainty Quantification for LLMs Fails Under Ambiguity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While real-world language is inherentlyambiguous, reflecting aleatoric uncertainty, existing UQ methods are typicallybenchmarked against tasks with no ambiguity. In this work, we demonstrate thatwhile current uncertainty estimators perform well under the restrictiveassumption of no ambiguity, they degrade to close-to-random performance onambiguous data. |
Tim Tomov; Dominik Fuchsgruber; Tom Wollschläger; Stephan Günnemann; | arxiv-cs.LG | 2025-11-06 |
| 2 | Ground-Truth Subgraphs for Better Training and Evaluation of Knowledge Graph Augmented LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We apply SynthKGQA to Wikidata to generate GTSQA, a new datasetdesigned to test zero-shot generalization abilities of KG retrievers withrespect to unseen graph structures and relation types, and benchmark popularsolutions for KG-augmented LLMs on it. |
Alberto Cattaneo; Carlo Luschi; Daniel Justus; | arxiv-cs.LG | 2025-11-06 |
| 3 | Plan of Knowledge: Retrieval-Augmented Large Language Models for Temporal Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: LLMs frequently suffer from hallucination and a lack of knowledge. Toaddress these limitations, we propose the Plan of Knowledge framework with acontrastive temporal retriever, which is named PoK. |
XINYING QIAN et. al. | arxiv-cs.CL | 2025-11-06 |
| 4 | BanglaMedQA and BanglaMMedBench: Evaluating Retrieval-Augmented Generation Strategies for Bangla Biomedical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces BanglaMedQA andBanglaMMedBench, the first large-scale Bangla biomedical Multiple ChoiceQuestion (MCQ) datasets designed to evaluate reasoning and retrieval in medicalartificial intelligence (AI). |
Sadia Sultana; Saiyma Sittul Muna; Mosammat Zannatul Samarukh; Ajwad Abrar; Tareque Mohmud Chowdhury; | arxiv-cs.CL | 2025-11-06 |
| 5 | SurgViVQA: Temporally-Grounded Video Question Answering for Surgical Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Video Question Answering (VideoQA) in the surgical domain aims to enhanceintraoperative understanding by enabling AI models to reason over temporallycoherent events rather than isolated frames. |
MAURO ORAZIO DRAGO et. al. | arxiv-cs.CV | 2025-11-05 |
| 6 | Knowledge-Augmented Question Error Correction for Chinese Question Answer System with QuestionRAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) struggle with this task, frequentlyfailing to interpret user intent (misinterpretation) or unnecessarily alteringthe original question’s structure (over-correction). We propose QuestionRAG, aframework that tackles these problems. |
Longpeng Qiu; Ting Li; Shuai Mao; Nan Yang; Xiaohui Yan; | arxiv-cs.CL | 2025-11-05 |
| 7 | Comparing The Performance of LLMs in RAG-based Question-Answering: A Case Study in Computer Science Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Evaluationmetrics employed in the study include accuracy and precision for binaryquestions and ranking by a human expert, ranking by Google’s AI model Gemini,alongside cosine similarity for long-answer questions. |
Ranul Dayarathne; Uvini Ranaweera; Upeksha Ganegoda; | arxiv-cs.CL | 2025-11-05 |
| 8 | ChiMDQA: Towards Comprehensive Chinese Document QA with Fine-grained Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the rapid advancement of natural language processing (NLP) technologies,the demand for high-quality Chinese document question-answering datasets issteadily growing. To address this issue, we present the Chinese Multi-DocumentQuestion Answering Dataset(ChiMDQA), specifically designed for downstreambusiness scenarios across prevalent domains including academic, education,finance, law, medical treatment, and news. |
Jing Gao; Shutiao Luo; Yumeng Liu; Yuanming Li; Hongji Zeng; | arxiv-cs.CL | 2025-11-05 |
| 9 | Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis, Solution, and Interpretation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through attention analysis, we find that learning new knowledgereduces the model’s attention to key entities in the question, thus causingexcessive focus on the surrounding context, which may increase the risk ofhallucination. |
Renfei Dang; Peng Hu; Changjiang Gao; Shujian Huang; | arxiv-cs.CL | 2025-11-04 |
| 10 | DEEPAMBIGQA: Ambiguous Multi-hop Questions for Benchmarking LLM Answer Completeness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing QA benchmarks rarely evaluateboth challenges jointly. To address this, we introduce DeepAmbigQAGen, anautomatic data generation pipeline that constructs QA tasks grounded in textcorpora and linked knowledge graph, generating natural and verifiable questionsthat systematically embed name ambiguity and multi-step reasoning. |
Jiabao Ji; Min Li; Priyanshu Kumar; Shiyu Chang; Saloni Potdar; | arxiv-cs.CL | 2025-11-03 |
| 11 | When to Trust The Answer: Question-Aligned Semantic Nearest Neighbor Entropy for Safer Surgical VQA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Question Aligned Semantic Nearest NeighborEntropy (QA-SNNE), a black box uncertainty estimator that incorporates questionsemantics into prediction confidence. |
DENNIS PIERANTOZZI et. al. | arxiv-cs.CV | 2025-11-03 |
| 12 | Hybrid Retrieval-Augmented Generation Agent for Trustworthy Legal Question Answering in Judicial Forensics Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As artificial intelligence permeates judicial forensics, ensuring theveracity and traceability of legal question answering (QA) has become critical.Conventional large language … |
YUEQING XI et. al. | arxiv-cs.AI | 2025-11-03 |
| 13 | Improving Construction Contract Question Answering Through Embedding Optimization and Semantic Chunking in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View |
Lanqian Zhang; Yan Ning; | Advanced Engineering Informatics | 2025-11-03 |
| 14 | A Graph-based RAG for Energy Efficiency Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the use of Large Language Models (LLMs) within agraph-based Retrieval Augmented Generation (RAG) architecture for EnergyEfficiency (EE) Question Answering. |
RICCARDO CAMPI et. al. | arxiv-cs.CL | 2025-11-03 |
| 15 | Text-VQA Aug: Pipelined Harnessing of Large Multimodal Models for Automated Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a pipeline for automated synthesis for text-VQA dataset thatcan produce faithful QA pairs, and which scales up with the availability ofscene text data. |
Soham Joshi; Shwet Kamal Mishra; Viswanath Gopalakrishnan; | arxiv-cs.CV | 2025-11-03 |
| 16 | StepSearch: Igniting LLMs Search Ability Via Step-Wise Proximal Policy Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Previous work has explored reinforcement learning (RL) to train LLMs to perform search-based document retrieval, achieving notable improvements in QA performance, but underperform on complex, multi-hop QA resulting from the sparse rewards from global signal only. To address this gap in existing research, we introduce StepSearch, a framework for search LLMs that trained with step-wise proximal policy optimization method. |
Xuhui Zheng; Kang An; Ziliang Wang; Yuhang Wang; Yichao Wu; | emnlp | 2025-11-02 |
| 17 | CondAmbigQA: A Benchmark and Dataset for Conditional Ambiguous Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Therefore, identifying possible implicit assumptions is crucial in QA. To address this fundamental challenge, we propose Conditional Ambiguous Question-Answering (CondAmbigQA), a benchmark comprising 2,000 ambiguous queries and condition-aware evaluation metrics. |
Zongxi Li; Yang Li; Haoran Xie; S. Joe Qin; | emnlp | 2025-11-02 |
| 18 | CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Embodied Question Answering (EQA) has primarily focused on indoor environments, leaving the complexities of urban settings—spanning environment, action, and perception—largely unexplored. To bridge this gap, we introduce CityEQA, a new task where an embodied agent answers open-vocabulary questions through active exploration in dynamic city spaces. |
YONG ZHAO et. al. | emnlp | 2025-11-02 |
| 19 | TVQACML: Benchmarking Text-Centric Visual Question Answering in Multilingual Chinese Minority Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the open-source nature of these benchmarks and the broad sources of training data for MLLMs have inevitably led to benchmark contamination, resulting in unreliable evaluation results. To alleviate this issue, we propose a contamination-free and more challenging TEC-VQA benchmark called Text-Centric Visual Question Answering in Multilingual Chinese Minority Languages(TVQACML), which involves eight languages, including Standard Chinese, Korean, and six minority languages. |
Sha Jiu; Yu Weng; Mengxiao Zhu; Chong Feng; Zheng Liu; | emnlp | 2025-11-02 |
| 20 | Memory-QA: Answering Recall Questions Based on Multimodal Memories Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This task poses unique challenges, including the creation of task-oriented memories, the effective utilization of temporal and location information within memories, and the ability to draw upon multiple memories to answer a recall question. To address these challenges, we propose a comprehensive pipeline, Pensieve, integrating memory-specific augmentation, time- and location-aware multi-signal retrieval, and multi-memory QA fine-tuning. |
HONGDA JIANG et. al. | emnlp | 2025-11-02 |
| 21 | KoBLEX: Open Legal Question Answering with Multi-hop Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these benchmarks fail to evaluate open-ended and provision-grounded Question Answering (QA). To address this, we introduce a Korean Benchmark for Legal EXplainable QA (KoBLEX), designed to evaluate provision-grounded, multi-hop legal reasoning. |
Jihyung Lee; Daehui Kim; Seonjeong Hwang; Hyounghun Kim; Gary Lee; | emnlp | 2025-11-02 |
| 22 | NitiBench: Benchmarking LLM Frameworks on Thai Legal Question Answering Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce NitiBench, a novel benchmark featuring two datasets: (1) NitiBench-CCL, covering Thai financial laws, and (2) NitiBench-Tax, containing Thailand’s official tax rulings. |
PAWITSAPAK AKARAJARADWONG et. al. | emnlp | 2025-11-02 |
| 23 | RJE: A Retrieval-Judgment-Exploration Framework for Efficient Knowledge Graph Question Answering with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent research leverages large language models (LLMs) to enhance KGQA reasoning, but faces limitations: retrieval-based methods are constrained by the quality of retrieved information, while agent-based methods rely heavily on proprietary LLMs. To address these limitations, we propose Retrieval-Judgment-Exploration (RJE), a framework that retrieves refined reasoning paths, evaluates their sufficiency, and conditionally explores additional evidence. |
CAN LIN et. al. | emnlp | 2025-11-02 |
| 24 | Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce EverGreenQA, the first multilingual QA dataset with evergreen labels, supporting both evaluation and training. |
SERGEY PLETENEV et. al. | emnlp | 2025-11-02 |
| 25 | StepER: Step-wise Knowledge Distillation for Enhancing Reasoning Ability in Multi-Step Retrieval-Augmented Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing knowledge distillation methods overlook the need for different reasoning abilities at different steps, hindering transfer in multi-step retrieval-augmented frameworks. To address this, we propose Step-wise Knowledge Distillation for Enhancing Reasoning Ability in Multi-Step Retrieval-Augmented Language Models (StepER). |
Kyumin Lee; Minjin Jeon; Sanghwan Jang; Hwanjo Yu; | emnlp | 2025-11-02 |
| 26 | From Chat Logs to Collective Insights: Aggregative Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Aggregative Question Answering, a novel task requiring models to reason explicitly over thousands of user-chatbot interactions to answer aggregational queries, such as identifying emerging concerns among specific demographics. |
Wentao Zhang; Woojeong Kim; Yuntian Deng; | emnlp | 2025-11-02 |
| 27 | Generating Spatial Knowledge Graphs from Automotive Diagrams for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate three distinct generation pipelines (Per-Attribute, Per-Component, and a Single-Shot baseline) to create the SKG using Large Vision-Language Models (LVLMs). |
Steve Bakos; Chen Xing; Heidar Davoudi; Aijun An; Ron DiCarlantonio; | emnlp | 2025-11-02 |
| 28 | Faster In-Context Learning for LLMs Via N-Gram Trie Speculative Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the lengthy retrieved contexts and limited token throughput in autoregressive models significantly constrain reasoning speed. To address this challenge, we propose N-Gram Trie Speculative Decoding, a novel approach that leverages the overlap between context and model output. |
JINGLIN CHEN et. al. | emnlp | 2025-11-02 |
| 29 | ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While large language models (LLMs) have achieved substantial improvements via chain-of-thought (CoT) prompting and retrieval-augmented generation, these methods typically adopt a forward-only workflow—early mistakes persist throughout inference, and contradictions discovered later cannot systematically trigger re-evaluation. To address this limitation, we present ReAgent, a reversible multi-agent reasoning framework. |
ZHAO XINJIE et. al. | emnlp | 2025-11-02 |
| 30 | MAviS: A Multimodal Conversational Assistant For Avian Species Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing multimodal large language models (MM-LLMs) face challenges when it comes to specialized topics like avian species, making it harder to provide accurate and contextually relevant information in these areas. To address this limitation, we introduce the **MAviS-Dataset**, a large-scale multimodal avian species dataset that integrates image, audio, and text modalities for over 1,000 bird species, comprising both pretraining and instruction-tuning subsets enriched with structured question–answer pairs. |
YEVHENIIA KRYKLYVETS et. al. | emnlp | 2025-11-02 |
| 31 | FLARE: Faithful Logic-Aided Reasoning and Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Faithful Logic-Aided Reasoning and Exploration (FLARE), which uses LLMs to plan solutions, formalize queries into logic programs, and simulate code execution through multi-hop search without external solvers. |
Erik Arakelyan; Pasquale Minervini; Patrick Lewis; Pat Verga; Isabelle Augenstein; | emnlp | 2025-11-02 |
| 32 | Truth, Trust, and Trouble: Medical AI on The Edge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a rigorous benchmarking framework via a dataset of over 1,000 health questions. |
MOHAMMAD ANAS AZEEZ et. al. | emnlp | 2025-11-02 |
| 33 | ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ComplexTempQA,a large-scale dataset consisting of over 100 million question-answer pairs designed to tackle the challenges in temporal question answering. |
Raphael Gruber; Abdelrahman Abdallah; Michael Färber; Adam Jatowt; | emnlp | 2025-11-02 |
| 34 | D-RAG: Differentiable Retrieval-Augmented Generation for Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the subgraph selection process is non-differentiable, preventing end-to-end training of the retriever and the generator in these approaches, which leads to sub-optimal performance. To overcome this limitation, this paper proposes a Differentiable RAG (D-RAG) approach that jointly optimizes the retriever and the generator for KGQA. |
GUANGZE GAO et. al. | emnlp | 2025-11-02 |
| 35 | CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they still face challenges in balancing retrieval precision and recall, impacting their efficacy in answering questions. To address this, we introduce **CAFE**, a two-stage coarse-to-fine method to enhance multi-document question-answering capacities. |
Han Peng; Jinhao Jiang; Zican Dong; Xin Zhao; Lei Fang; | emnlp | 2025-11-02 |
| 36 | Discrepancy Detection at The Data Level: Toward Consistent Multilingual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose MIND, a user-in-the-loop fact-checking pipeline to detect factual and cultural discrepancies in multilingual QA knowledge bases. |
LORENA CALVO-BARTOLOMÉ et. al. | emnlp | 2025-11-02 |
| 37 | LiteraryQA: Towards Effective Evaluation of Long-document Narrative QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce LiteraryQA, a high-quality subset of NarrativeQA focused on literary works. |
Tommaso Bonomo; Luca Gioffré; Roberto Navigli; | emnlp | 2025-11-02 |
| 38 | Static or Dynamic: Towards Query-Adaptive Token Selection for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel token selection strategy, explore-then-select, that adaptively adjusts static and dynamic information based on question requirements. |
Yumeng Shi; Quanyu Long; Wenya Wang; | emnlp | 2025-11-02 |
| 39 | OntologyRAG-Q: Resource Development and Benchmarking for Retrieval-Augmented Question Answering in Qur’anic Tafsir Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a comprehensive framework for handling sensitive Qur’anic Tafsir data that spans the entire pipeline from dataset construction through evaluation and error analysis. |
Sadam Al-Azani; Maad Alowaifeer; Alhanoof Alhunief; Ahmed Abdelali; | emnlp | 2025-11-02 |
| 40 | RTQA : Recursive Thinking for Complex Temporal Knowledge Graph Question Answering with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current temporal knowledge graph question answering (TKGQA) methods primarily focus on implicit temporal constraints, lacking the capability to handle more complex temporal queries, and struggle with limited reasoning abilities and error propagation in decomposition frameworks. We propose RTQA, a novel framework to address these challenges by enhancing reasoning over TKGs without requiring training. |
ZHAOYAN GONG et. al. | emnlp | 2025-11-02 |
| 41 | TALON: A Multi-Agent Framework for Long-Table Exploration and Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose TALON, a multi-agent framework designed for question answering over long tables. |
RUOCHUN JIN et. al. | emnlp | 2025-11-02 |
| 42 | TreeRare: Syntax Tree-Guided Retrieval and Reasoning for Knowledge-Intensive Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the performance of such retrieval frameworks is limited by the accumulation of reasoning errors and misaligned retrieval results. To overcome these limitations, we propose TreeRare (Syntax Tree-Guided Retrieval and Reasoning, a framework that utilizes syntax trees to guide information retrieval and reasoning for question answering. |
Boyi Zhang; Zhuo Liu; Hangfeng He; | emnlp | 2025-11-02 |
| 43 | MEBench: Benchmarking Large Language Models for Cross-Document Multi-Entity Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although Large Language Models (LLMs) and Retrieval-augmented Generation (RAG) systems show promise, their performance on cross-document MEQA remains underexplored due to the absence of tailored benchmarks. To address this gap, we introduce MEBench, a scalable multi-document, multi-entity benchmark designed to systematically evaluate LLMs’ capacity to retrieve, consolidate, and reason over scattered and dense information. |
TENG LIN et. al. | emnlp | 2025-11-02 |
| 44 | XLQA: A Benchmark for Locale-Aware Multilingual Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This assumption neglects the cultural and regional variations that affect question understanding and answer, leading to biased evaluation in multilingual benchmarks. To address these limitations, we introduce XLQA, a novel benchmark explicitly designed for locale-sensitive multilingual ODQA. |
Keonwoo Roh; Yeong-Joon Ju; Seong-Whan Lee; | emnlp | 2025-11-02 |
| 45 | FacLens: Transferable Probe for Foreseeing Non-Factuality in Fact-Seeking Question Answering of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a lightweight model named Factuality Lens (FacLens), which effectively probes hidden representations of fact-seeking questions for the NFP task. |
YANLING WANG et. al. | emnlp | 2025-11-02 |
| 46 | Refining Attention for Explainable and Noise-Robust Fact-Checking with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conventional transformer-based models excel at classifying input data, but (i) often falter due to sensitivity to noise and (ii) lack explainability regarding their decision process. To address these challenges, we introduce ATTUN, a novel transformer architecture designed to enhance model transparency and resilience to noise by refining the attention mechanisms. |
Jean-Flavien Bussotti; Paolo Papotti; | emnlp | 2025-11-02 |
| 47 | ProtoVQA: An Adaptable Prototypical Framework for Explainable Fine-Grained Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present ProtoVQA, a unified prototypical framework that (i) learns question-aware prototypes that serve as reasoning anchors, connecting answers to discriminative image regions, (ii) applies spatially constrained matching to ensure that the selected evidence is coherent and semantically relevant, and (iii) supports both answering and grounding tasks through a shared prototype backbone. |
XINGJIAN DIAO et. al. | emnlp | 2025-11-02 |
| 48 | T2: An Adaptive Test-Time Scaling Strategy for Contextual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: But they add human bias to the reasoning process and fail to leverage models’ inherent reasoning capabilities. To address these limitations, we present T2: Think-to-Think, a novel framework that dynamically adapts reasoning depth based on question complexity. |
ZHENGYI ZHAO et. al. | emnlp | 2025-11-02 |
| 49 | CompKBQA: Component-wise Task Decomposition for Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the challenge of generating error-free logical forms remains, as skeleton, topic Entity, and relation Errors still frequently occur. To address these challenges, we propose CompKBQA(Component-wise Task Decomposition for Knowledge Base Question Answering), a novel framework that optimizes the process of fine-tuning a LLM for generating logical forms by enabling the LLM to progressively learn relevant sub-tasks like skeleton generation, topic entity generation, and relevant relations generation. |
YUHANG TIAN et. al. | emnlp | 2025-11-02 |
| 50 | BYOKG-RAG: Multi-Strategy Graph Retrieval for Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce BYOKG-RAG, a framework that enhances KGQA by synergistically combining LLMs with specialized graph retrieval tools. |
COSTAS MAVROMATIS et. al. | emnlp | 2025-11-02 |
| 51 | RAGferee: Building Contextual Reward Models for Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the lack of publicly available RAG-centric preference datasets and specialised RMs, we introduce RAGferee, a methodology that repurposes question-answering (QA) datasets into preference pairs that prioritise groundedness over stylistic features, enabling the training of contextual RMs better suited to judging RAG responses. |
ANDREI CATALIN COMAN et. al. | emnlp | 2025-11-02 |
| 52 | UNCLE: Benchmarking Uncertainty Expressions in Long-Form Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Along with UNCLE, we propose a suite of new metrics to assess the models’ capabilities to selectively express uncertainty. |
RUIHAN YANG et. al. | emnlp | 2025-11-02 |
| 53 | Weaver: Interweaving SQL and LLM for Table Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches that combine SQL and LLM typically rely on rigid, predefined workflows, limiting their adaptability to complex queries. To address these issues, we introduce Weaver, a modular pipeline that dynamically integrates SQL and LLM for table-based question answering (Table QA). |
Rohit Khoja; Devanshu Gupta; Yanjie Fu; Dan Roth; Vivek Gupta; | emnlp | 2025-11-02 |
| 54 | CoCoA: Confidence- and Context-Aware Adaptive Decoding for Resolving Knowledge Conflicts in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CoCoA (Confidence- and Context-Aware Adaptive Decoding), a novel token-level algorithm for principled conflict resolution and enhanced faithfulness. |
Anant Khandelwal; Manish Gupta; Puneet Agrawal; | emnlp | 2025-11-02 |
| 55 | SportReason: Evaluating Retrieval-Augmented Reasoning Across Tables and Text for Sports Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present SportReason, a benchmark for retrieval-augmented reasoning on numerical sports questions. |
Kaiyue Feng; Siyue Zhang; Bingsen Chen; Yilun Zhao; Chen Zhao; | emnlp | 2025-11-02 |
| 56 | Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this survey, we propose a new structured taxonomy that categorizes the methodology of synthesizing LLMs and KGs for QA according to the categories of QA and the KG’s role when integrating with LLMs. |
Chuangtao Ma; Yongrui Chen; Tianxing Wu; Arijit Khan; Haofen Wang; | emnlp | 2025-11-02 |
| 57 | Answering Narrative-Driven Recommendation Queries Via A Retrieve–Rank Paradigm and The OCG-Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work formally introduces narrative recommendation as a distinct task and contends that the RAG paradigm is inherently ill-suited for it, owing to information loss in LLMs when retrieving information from from multiple long and fragmented contexts, and limitations in ranking effectiveness. |
YUNXIAO SHI et. al. | emnlp | 2025-11-02 |
| 58 | Confidence-guided Refinement Reasoning for Zero-shot Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Confidence-guided Refinement Reasoning (C2R), a novel training-free framework applicable to question-answering (QA) tasks across text, image, and video domains. |
Youwon Jang; Woo Suk Choi; Minjoon Jung; Minsu Lee; Byoung-Tak Zhang; | emnlp | 2025-11-02 |
| 59 | PakBBQ: A Culturally Adapted Bias Benchmark for QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most LLMs are trained and evaluated on Western centric data, with little attention paid to low-resource languages and regional contexts. To address this gap, we introduce PakBBQ, a culturally and regionally adapted extension of the original Bias Benchmark for Question Answering (BBQ) dataset. |
Abdullah Hashmat; Muhammad Arham Mirza; Agha Ali Raza; | emnlp | 2025-11-02 |
| 60 | RAVEN: Query-Guided Representation Alignment for Question Answering Over Audio, Video, Embedded Sensors, and Natural Language Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present RAVEN, a unified QA architecture whose core is QuART, a query-conditioned cross-modal gating module that assigns scalar relevance scores to each token across modalities, enabling the model to amplify informative signals and suppress distractors before fusion. |
Subrata Biswas; Mohammad Nur Hossain Khan; Bashima Islam; | emnlp | 2025-11-02 |
| 61 | LaMP-QA: A Benchmark for Personalized Long-form Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is mainly due to lack of resources for training and evaluating personalized question answering systems. We address this gap by introducing LaMP-QA—a benchmark designed for evaluating personalized long-form answer generation. |
Alireza Salemi; Hamed Zamani; | emnlp | 2025-11-02 |
| 62 | TableRAG: A Retrieval Augmented Generation Framework for Heterogeneous Document Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The prevailing practice of flattening tables and chunking strategies disrupts the intrinsic tabular structure, leads to information loss, and undermines the reasoning capabilities of LLMs in multi-hop, global queries. To address these challenges, we propose TableRAG, an SQL-based framework that unifies textual understanding and complex manipulations over tabular data. |
Xiaohan Yu; Pu Jian; Chong Chen; | emnlp | 2025-11-02 |
| 63 | Don’t Forget The Base Retriever! A Low-Resource Graph-based Retriever for Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GR\small{IEVER}, a lightweight, low-resource, multi-step graph-based retriever for multi-hop QA. |
ANDRE MELO et. al. | emnlp | 2025-11-02 |
| 64 | Retrieving Support to Rank Answers in Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel Question Answering (QA) architecture that enhances answer selection by retrieving targeted supporting evidence. |
Zeyu Zhang; Alessandro Moschitti; Thuy Vu; | emnlp | 2025-11-02 |
| 65 | What Are Foundation Models Cooking in The Post-Soviet World? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the Post-Soviet cultural food knowledge of foundation models by constructing BORSch, a multi-modal dataset encompassing 1147 and 823 dishes in the Russian and Ukrainian languages, centered around the Post-Soviet region. |
Anton Lavrouk; Tarek Naous; Alan Ritter; Wei Xu; | emnlp | 2025-11-02 |
| 66 | Tagging-Augmented Generation: Assisting Language Models in Finding Intricate Knowledge In Long Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Tagging-Augmented Generation (TAG), a lightweight data augmentation strategy that boosts LLM performance in long-context scenarios, without degrading and altering the integrity and composition of retrieved documents. |
ANWESAN PAL et. al. | emnlp | 2025-11-02 |
| 67 | SilVar: Speech-Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the quality of language models primarily depends on reasoning and prompting techniques, such as chain-of-thought, which remain underexplored when using speech instructions. To address these challenges, we propose SilVar, an end-to-end multimodal model that leverages speech instructions for reasoning-based visual question answering. |
Tan-Hanh Pham; Le Hoang Nam; Phu-Vinh Nguyen; Chris Ngo; Truong-Son Hy; | emnlp | 2025-11-02 |
| 68 | Trustworthy Medical Question Answering: An Evaluation-Centric Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this survey, we systematically examine six key dimensions of trustworthiness in medical QA, i. e. , Factuality, Robustness, Fairness, Safety, Explainability, and Calibration. |
YINUO WANG et. al. | emnlp | 2025-11-02 |
| 69 | How Accurate Are LLMs at Multi-Question Answering on Conversational Transcripts? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore the capabilities of LLMs to answer multiple questions based on the same conversational context. |
Xiliang Zhu; Shi Zong; David Rossouw; | emnlp | 2025-11-02 |
| 70 | VinDr-CXR-VQA: A Visual Question Answering Dataset for Explainable Chest X-Ray Analysis with Multi-Task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present VinDr-CXR-VQA, a large-scale chest X-ray dataset for explainableMedical Visual Question Answering (Med-VQA) with spatial grounding. |
Hai-Dang Nguyen; Ha-Hieu Pham; Hao T. Nguyen; Huy-Hieu Pham; | arxiv-cs.CV | 2025-11-01 |
| 71 | FARSIQA: Faithful and Advanced RAG System for Islamic Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing Retrieval-Augmented Generation (RAG)systems, relying on simplistic single-pass pipelines, fall short on complex,multi-hop queries requiring multi-step reasoning and evidence aggregation. Toaddress this gap, we introduce FARSIQA, a novel, end-to-end system for FaithfulAdvanced Question Answering in the Persian Islamic domain. |
Mohammad Aghajani Asl; Behrooz Minaei Bidgoli; | arxiv-cs.CL | 2025-10-29 |
| 72 | Beyond Long Context: When Semantics Matter More Than Tokens Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Clinical Entity Augmented Retrieval (CLEAR)method, introduced by Lopez et al. 2025, uses entity aware retrieval andachieved improved performance with an F1 score of 0.90 versus 0.86 forembedding based retrieval, while using over 70 percent fewer tokens. |
Tarun Kumar Chawdhury; Jon D. Duke; | arxiv-cs.CL | 2025-10-29 |
| 73 | Adapting Small Language Models to Low-Resource Domains: A Case Study in Hindi Tourism QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a multi-stagefinetuning strategy to adapt lightweight language models to the Hindi tourismdomain by leveraging both original and synthetic training data. |
Sandipan Majhi; Paheli Bhattacharya; | arxiv-cs.CL | 2025-10-29 |
| 74 | A Multimodal and Dynamically Updatable Benchmark for Aviation Question Answering with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a multimodal, multi-level benchmark dataset tailored to aviation QA tasks, alongside an automated updating mechanism and a multi-dimensional evaluation framework. |
Liu He; Shuyan Liu; Xiaorui Qin; Ran An; Jianghui Zeng; | International Journal of Robotics and Automation Technology | 2025-10-29 |
| 75 | A Knowledge Graph Enhancement Technique for HIPAA Compliant Health Question Answering in Personal Health Libraries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In contrast, Knowledge Graphs (KGs) have proven effective for QA tasks, especially in extracting structured insights from text, but transforming free text into KGs often leads to information or context loss that can compromise answer accuracy. To overcome this challenge, we present a novel iterative and monotonic KG refinement technique that enriches knowledge representation without sacrificing contextual integrity. |
Hasan Jamil; | ACM Transactions on Computing for Healthcare | 2025-10-28 |
| 76 | Towards Complex Table Question Answering Over Tabular Data Lakes (Extended Version) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we systematically analyze how LLMs paired with table retrievers can answer queries over private tabular data lakes. |
Daniela Risis; Jan-Micha Bodensohn; Matthias Urban; Carsten Binnig; | Datenbank-Spektrum | 2025-10-27 |
| 77 | IPQA: A Benchmark for Core Intent Identification in Personalized Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since users do not explicitly state their prioritizedintents, we derive core intents from observable behavior patterns in answerselection, grounded in satisficing theory where users choose answers meetingtheir acceptance thresholds. |
Jieyong Kim; Maryam Amirizaniani; Soojin Yoon; Dongha Lee; | arxiv-cs.CL | 2025-10-27 |
| 78 | Jarvis: Towards Personalized AI Assistant Via Personal KV-Cache Retrieval Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rapid development of Vision-language models (VLMs) enables open-endedperception and reasoning. Recent works have started to investigate how to adaptgeneral-purpose VLMs into … |
BINXIAO XU et. al. | arxiv-cs.AI | 2025-10-26 |
| 79 | DMC$^3$: Dual-Modal Counterfactual Contrastive Construction for Egocentric Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To deal with thesechallenges, we propose a Dual-Modal Counterfactual Contrastive Construction(DMC$^3$) framework, which contains an egocentric videoqa baseline, acounterfactual sample construction module and a counterfactual sample-involvedcontrastive optimization. |
Jiayi Zou; Chaofan Chen; Bing-Kun Bao; Changsheng Xu; | arxiv-cs.CV | 2025-10-23 |
| 80 | GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose GlobalRAG, a reinforcementlearning framework designed to enhance global reasoning in multi-hop QA.GlobalRAG decomposes questions into subgoals, coordinates retrieval withreasoning, and refines evidence iteratively. |
JINCHANG LUO et. al. | arxiv-cs.CL | 2025-10-23 |
| 81 | Hierarchical Sequence Iteration for Heterogeneous Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Retrieval-augmented generation (RAG) remains brittle on multi-step questionsand heterogeneous evidence sources, trading accuracy against latency andtoken/tool budgets. This paper … |
Ruiyi Yang; Hao Xue; Imran Razzak; Hakim Hacid; Flora D. Salim; | arxiv-cs.CL | 2025-10-23 |
| 82 | VLSP 2025 MLQA-TSR Challenge: Vietnamese Multimodal Legal Question Answering on Traffic Sign Regulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the VLSP 2025 MLQA-TSR – the multimodal legal questionanswering on traffic sign regulation shared task at VLSP 2025. |
SON T. LUU et. al. | arxiv-cs.CL | 2025-10-23 |
| 83 | Multimedia-Aware Question Answering: A Review of Retrieval and Cross-Modal Reasoning Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this survey, we review recent advancements in QAsystems that integrate multimedia retrieval pipelines, focusing onarchitectures that align vision, language, and audio modalities with userqueries. |
Rahul Raja; Arpita Vats; | arxiv-cs.IR | 2025-10-23 |
| 84 | Task-guided Dynamic Visual Reasoning for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a task-guided dynamic visual reasoning method for visual question answering, which models the spatiotemporal states of objects in dynamic scenes, decomposes the questions into task steps, and finally deduces reasoning on the established spatiotemporal dynamic scene graph neural network. |
Yao Cong; Hongwei Mo; | International Journal of Humanoid Robotics | 2025-10-23 |
| 85 | Bridging Language Gaps with Adaptive RAG: Improving Indonesian Language Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome the limitedavailability of Indonesian language dataset, our study employs machinetranslation as data augmentation approach. |
William Christian; Daniel Adamlu; Adrian Yu; Derwin Suhartono; | arxiv-cs.CL | 2025-10-23 |
| 86 | That’s Deprecated! Understanding, Detecting, and Steering Knowledge Conflicts in Language Models for Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on prior question-answering (QA)research, we extend the investigation of knowledge conflicts to the realm ofcode generation. We propose a domain-agnostic framework for constructing andinterpreting such conflicts, along with a novel evaluation method and datasettailored to code conflict scenarios. |
JAESUNG BAE et. al. | arxiv-cs.CL | 2025-10-21 |
| 87 | Investigating LLM Capabilities on Long Context Comprehension for Medical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We shed light into some of the evaluation aspects using amulti-faceted approach. |
Feras AlMannaa; Talia Tseriotou; Jenny Chim; Maria Liakata; | arxiv-cs.CL | 2025-10-21 |
| 88 | Explainable Bilingual Medical-Question-Answering Model Using Ensemble Learning Technique Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study establishes a foundation for building multilingual healthcare information systems, promoting inclusive and equitable access to medical information. |
Abdul Rahaman Wahab Sait; Yazeed Alkhurayyif; | Electronics | 2025-10-21 |
| 89 | AV-Master: Dual-Path Comprehensive Perception Makes Better Audio-Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This limits their reasoning capability in complex scenarios. Toaddress these challenges, we propose a novel framework named AV-Master. |
JIAYU ZHANG et. al. | arxiv-cs.CV | 2025-10-21 |
| 90 | IMB: An Italian Medical Benchmark for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present twocomprehensive Italian medical benchmarks: \textbf{IMB-QA}, containing 782,644patient-doctor conversations from 77 medical categories, and \textbf{IMB-MCQA},comprising 25,862 multiple-choice questions from medical specialtyexaminations. |
Antonio Romano; Giuseppe Riccio; Mariano Barone; Marco Postiglione; Vincenzo Moscato; | arxiv-cs.CL | 2025-10-21 |
| 91 | Interpretable Question Answering with Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a question answering system that operates exclusively ona knowledge graph retrieval without relying on retrieval augmented generation(RAG) with large language models (LLMs). |
Kartikeya Aneja; Manasvi Srivastava; Subhayan Das; Nagender Aneja; | arxiv-cs.CL | 2025-10-21 |
| 92 | From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Both issues can mislead reasoning and undermineanswer reliability. To address these challenges, we propose MedRGAG, a unifiedretrieval-generation augmented framework that seamlessly integrates externaland parametric knowledge for medical QA. |
Lei Li; Xiao Zhou; Yingying Zhang; Xian Wu; | arxiv-cs.CL | 2025-10-21 |
| 93 | ETVA: Evaluation of Text-to-Video Alignment Via Fine-grained Question Generation and Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing text-to-video alignment metrics like CLIPScore only generate coarse-grained scores without fine-grained alignment details, failing to align with human preference. To address this limitation, we propose ETVA, a novel Evaluation method of Text-to-Video Alignment via fine-grained question generation and answering. |
KAISI GUAN et. al. | iccv | 2025-10-20 |
| 94 | ReasonVQA: A Multi-hop Reasoning Benchmark with Structural Knowledge for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new dataset, ReasonVQA, for the Visual Question Answering (VQA) task. |
Duong T. Tran; Trung-Kien Tran; Manfred Hauswirth; Danh Le Phuoc; | iccv | 2025-10-20 |
| 95 | Acknowledging Focus Ambiguity in Visual Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: No published work on visual question answering (VQA) accounts for ambiguity regarding where the content described in the question is located in the image. To fill this gap, we introduce VQ-FocusAmbiguity, the first VQA dataset that visually grounds each plausible image region a question could refer to when arriving at valid answers. |
Chongyan Chen; Yu-Yun Tseng; Zhuoheng Li; Anush Venkatesh; Danna Gurari; | iccv | 2025-10-20 |
| 96 | Overcoming Dual Drift for Continual Long-Tailed Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Continual Long-Tailed Visual Question Answering (CLT-VQA) and identify two critical challenges: inner-task prototype drift, where classifier prototypes become biased toward majority classes due to imbalanced data, and inter-task feature drift, where learned features shift over time, causing forgetting of previously learned knowledge. To address these challenges, we propose a unified dual-balance approach that integrates a Balanced Classifier Prototype (BCP) learning module and a Multi-modal Feature Alignment (MFA) module. |
Feifei Zhang; Zhihao Wang; Xi Zhang; Changsheng Xu; | iccv | 2025-10-20 |
| 97 | HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new task to benchmark human-in-scene understanding for embodied agents: Human-In-Scene Question Answering (HIS-QA). |
Jiahe Zhao; Ruibing Hou; Zejie Tian; Hong Chang; Shiguang Shan; | iccv | 2025-10-20 |
| 98 | AVAM: A Universal Training-free Adaptive Visual Anchoring Embedded Into Multimodal Large Language Model for Multi-image Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a straightforward yet universal Adaptive Visual Anchoring strategy, which can be seamlessly integrated into existing MLLMs, offering significant accuracy improvements through adaptive compression. |
Kang Zeng; Guojin Zhong; Jintao Cheng; Jin Yuan; Zhiyong Li; | iccv | 2025-10-20 |
| 99 | Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present QUestion-only replay with Attention Distillation (QUAD), a novel approach for VQACL that leverages only past task questions for regularization. |
Imad Eddine Marouf; Enzo Tartaglione; Stéphane Lathuilière; Joost Van De Weijer; | iccv | 2025-10-20 |
| 100 | TOGA: Temporally Grounded Open-Ended Video QA with Weak Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given a video and a question, we generate an open-ended answer grounded with the start and end time. For this task, we propose TOGA: a vision-language model for Temporally Grounded Open-Ended Video QA with Weak Supervision. |
AYUSH GUPTA et. al. | iccv | 2025-10-20 |
| 101 | PVChat: Personalized Video Chat with One-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we introduce an automated augmentation pipeline that synthesizes identity-preserving positive samples and retrieves hard negatives from existing video corpora, generating a diverse training dataset with four QA types: existence, appearance, action, and location inquiries. |
YUFEI SHI et. al. | iccv | 2025-10-20 |
| 102 | Beyond The Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To improve exploration efficiency, we propose Fine-EQA, a hybrid exploration model that integrates frontier-based and goal-oriented navigation to guide agents toward task-relevant regions more effectively. |
KAIXUAN JIANG et. al. | iccv | 2025-10-20 |
| 103 | Object-centric Video Question Answering with Visual Grounding and Referring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing models primarily focus on high-level comprehension and are limited to text-only responses, restricting the flexibility for object-centric, multi-round interactions. In this paper, we make three contributions:(i) we address these limitations by introducing a VideoLLM, termed as **RGA3**, capable of performing both object referring and grounding for video reasoning tasks in a multi-round conversational manner, i.e., allowing users to iteratively interact with videos using both textual and visual queries; (ii) we propose **STOM** (Spatial-Temporal Overlay Module), a novel approach that allows arbitrary visual prompts to be processed at any timestamp within a video;(iii) we present **VideoInfer**, a manually curated object-centric video instruction dataset featuring question-answering pairs that require reasoning. |
HAOCHEN WANG et. al. | iccv | 2025-10-20 |
| 104 | SMR-agents: Synergistic Medical Reasoning Agents for Zero-shot Medical Visual Question Answering with MLLMs Related Papers Related Patents Related Grants Related Venues Related Experts View |
Dujuan Wang; Tao Cheng; Sutong Wang; Y. Chen; Yunqiang Yin; | Inf. Process. Manag. | |
| 105 | SceneCOT: Eliciting Grounded Chain-of-Thought Reasoning in 3D Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paperbridges the gap by presenting a novel framework. |
Xiongkun Linghu; Jiangyong Huang; Ziyu Zhu; Baoxiong Jia; Siyuan Huang; | arxiv-cs.CV | 2025-10-19 |
| 106 | SQuAI: Scientific Question-Answering with Multi-Agent Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present SQuAI (https://squai.scads.ai/), a scalable and trustworthymulti-agent retrieval-augmented generation (RAG) framework for scientificquestion answering (QA) with large language models (LLMs). |
Ines Besrour; Jingbo He; Tobias Schreieder; Michael Färber; | arxiv-cs.IR | 2025-10-17 |
| 107 | Prompt Design for Medical Question Answering with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View |
LEONID KULIGIN et. al. | Machine Learning with Applications | 2025-10-17 |
| 108 | DTKG: Dual-Track Knowledge Graph-Verified Reasoning Framework for Multi-Hop QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These limitations deteriorate the efficiency andaccuracy for multi-hop QA tasks. To address this challenge, we propose a noveldual-track KG verification and reasoning framework DTKG, which is inspired bythe Dual Process Theory in cognitive science. |
CHANGHAO WANG et. al. | arxiv-cs.AI | 2025-10-17 |
| 109 | AutoGraph-R1: End-to-End Reinforcement Learning for Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, itseffectiveness is hindered by a fundamental disconnect: the knowledge graph (KG)construction process is decoupled from its downstream application, yieldingsuboptimal graph structures. To bridge this gap, we introduce AutoGraph-R1, thefirst framework to directly optimize KG construction for task performance usingReinforcement Learning (RL). |
HONG TING TSANG et. al. | arxiv-cs.CL | 2025-10-17 |
| 110 | Applications and Challenges of Retrieval-Augmented Generation (RAG) in Maternal Health: A Multi-Axial Review of The State of The Art in Biomedical QA with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this context, retrieval-augmented generation (RAG) systems provide a promising approach to enhance traceability, timeliness, and accuracy in tasks such as biomedical question answering (QA). This article presents a narrative and thematic review of the evolution of these technologies in maternal health, structured across five axes: technical foundations of RAG, advancements in biomedical LLMs, conversational agents in healthcare, clinical validation frameworks, and specific applications in obstetric telehealth. |
ADRIANA NOGUERA et. al. | Sci | 2025-10-16 |
| 111 | MedTrust-RAG: Evidence Verification and Trust Alignment for Biomedical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose MedTrust-Guided Iterative RAG, aframework designed to enhance factual consistency and mitigate hallucinationsin medical QA. |
YINGPENG NING et. al. | arxiv-cs.CL | 2025-10-16 |
| 112 | Finding Answers in Thought Matters: Revisiting Evaluation on Large Language Models with Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the other hand, for models requiring reasoning, the method of answer extraction plays a critical role. Our research reveals that the performance of reasoning models and their final answer distributions are highly sensitive to the answer extraction algorithm employed. In order to mitigate this, we propose a basic framework: Answer Regeneration. The method uses an additional model inference, providing the prior input and output prefaced by the prompt "Answer:". |
HWIYEOL JO et. al. | arxiv-cs.CL | 2025-10-16 |
| 113 | Interactive Environment-Aware Planning System and Dialogue for Social Robots in Early Childhood Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose an interactive environment-aware dialog and planning system for social robots in early childhood education, aimed at supporting the learning and social interaction of young children. |
Jiyoun Moon; Seung Min Song; | Applied Sciences | 2025-10-16 |
| 114 | PRISM: Agentic Retrieval with LLMs for Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an Agentic Retrieval System that leverages large language models (LLMs) in a structured loop to retrieve relevant evidence with high precision and recall. |
Md Mahadi Hasan Nahid; Davood Rafiei; | arxiv-cs.CL | 2025-10-16 |
| 115 | BioMedSearch: A Multi-Source Biomedical Retrieval Framework Based on LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To evaluate the accuracy of question answering, we constructed a multi-level dataset, BioMedMCQs, consisting of 3,000 questions. |
CONGYING LIU et. al. | arxiv-cs.CL | 2025-10-15 |
| 116 | Who’s Asking? Evaluating LLM Robustness to Inquiry Personas in Factual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the first systematic evaluation of LLM robustness to inquiry personas, i.e. user profiles that convey attributes like identity, expertise, or belief. |
Nil-Jana Akpinar; Chia-Jung Lee; Vanessa Murdock; Pietro Perona; | arxiv-cs.CL | 2025-10-14 |
| 117 | An Empirical Study for Representations of Videos in Video Question Answering Via MLLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a comprehensive empirical study of video representation methods for VideoQA with MLLMs. |
Zhi Li; Yanan Wang; Hao Niu; Julio Vizcarra; Masato Taya; | arxiv-cs.IR | 2025-10-14 |
| 118 | ESI: Epistemic Uncertainty Quantification Via Semantic-preserving Intervention for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we establish a connection between the uncertainty of LLMs and their invariance under semantic-preserving intervention from a causal perspective. |
Mingda Li; Xinyu Li; Weinan Zhang; Longxuan Ma; | arxiv-cs.CL | 2025-10-14 |
| 119 | Discrepancy Detection at The Data Level: Toward Consistent Multilingual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate MIND on a bilingual QA system in the maternal and infant health domain and release a dataset of bilingual questions annotated for factual and cultural inconsistencies. |
LORENA CALVO-BARTOLOMÉ et. al. | arxiv-cs.CL | 2025-10-13 |
| 120 | VeritasFi: An Adaptable, Multi-tiered RAG Framework for Multi-modal Financial Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Retrieval-Augmented Generation (RAG) is becoming increasingly essential for Question Answering (QA) in the financial sector, where accurate and contextually grounded insights from complex public disclosures are crucial. However, existing financial RAG systems face two significant challenges: (1) they struggle to process heterogeneous data formats, such as text, tables, and figures; and (2) they encounter difficulties in balancing general-domain applicability with company-specific adaptation. |
ZHENGHAN TAI et. al. | arxiv-cs.IR | 2025-10-12 |
| 121 | RIPRAG: Hack A Black-box Retrieval-Augmented Generation Question-Answering System with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate a more complex and realistic scenario: the attacker lacks knowledge of the RAG system’s internal composition and implementation details, and the RAG system comprises components beyond a mere retriever. |
MENG XI et. al. | arxiv-cs.AI | 2025-10-11 |
| 122 | AssoMem: Scalable Memory QA with Multi-Signal Associative Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by how humans link information associatively, we propose AssoMem, a novel framework constructing an associative memory graph that anchors dialogue utterances to automatically extracted clues. |
KAI ZHANG et. al. | arxiv-cs.CL | 2025-10-11 |
| 123 | LONGQAEVAL: Designing Reliable Evaluations of Long-Form Clinical QA Under Resource Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LongQAEval, an evaluation framework and set of evaluation recommendations for limited-resource and high-expertise settings. Based on physician annotations of 300 real patient questions answered by physicians and LLMs, we compare coarse answer-level versus fine-grained sentence-level evaluation over the dimensions of correctness, relevance, and safety. |
Federica Bologna; Tiffany Pan; Matthew Wilkens; Yue Guo; Lucy Lu Wang; | arxiv-cs.CL | 2025-10-11 |
| 124 | Closing The Data-Efficiency Gap Between Autoregressive and Masked Diffusion LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This proposed method successfully and drastically improves the data efficiency of arLLM fine-tuning, effectively closing the performance gap with dLLMs. |
Xu Pan; Ely Hahami; Jingxuan Fan; Ziqian Xie; Haim Sompolinsky; | arxiv-cs.CL | 2025-10-10 |
| 125 | NG-Router: Graph-Supervised Multi-Agent Collaboration for Nutrition Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To further address contextual overload, we propose a gradient-based subgraph retrieval mechanism that identifies salient evidence during training, thereby enhancing multi-hop and relational reasoning. Extensive experiments across multiple benchmarks and backbone models demonstrate that NG-Router consistently outperforms both single-agent and ensemble baselines, offering a principled approach to domain-aware multi-agent reasoning for complex nutritional health tasks. |
KAIWEN SHI et. al. | arxiv-cs.CL | 2025-10-10 |
| 126 | Research on Sem-RAG: A Corn Planting Knowledge Question-Answering Algorithm Based on Fine-Grained Semantic Information Retrieval Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in knowledge-intensive domains such as agriculture, hallucination and insufficient retrieval accuracy remain challenging. To address these issues, we propose Sem-RAG, a corn planting knowledge question-answering algorithm based on fine-grained semantic retrieval enhancement. |
Bing Bai; Xiaoyan Meng; Chenzi Zhao; | Applied Sciences | 2025-10-09 |
| 127 | A$^2$Search: Ambiguity-Aware Question Answering with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing models still struggle with questions that admit multiple valid answers. Standard QA benchmarks, which typically assume a single gold answer, overlook this reality and thus produce inappropriate training signals. Existing attempts to handle ambiguity often rely on costly manual annotation, which is difficult to scale to multi-hop datasets such as HotpotQA and MuSiQue. In this paper, we present A$^2$Search, an annotation-free, end-to-end training framework to recognize and handle ambiguity. |
FENGJI ZHANG et. al. | arxiv-cs.CL | 2025-10-09 |
| 128 | IDQuAD: Infectious Disease Question and Answering Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In conclusion, this study introduces IDQuAD as a foundational dataset for infectious disease research, demonstrating the effectiveness of fine-tuning LLMs and paving the way for future advances in dataset development and LLM refinement for infectious disease tasks. |
Soonchan Kwon; Sujeong Hur; Beakcheol Jang; | PLOS One | 2025-10-09 |
| 129 | AI Knowledge Assist: An Automated Approach for The Creation of Knowledge Bases for Conversational AI Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce AI Knowledge Assist, a system that extracts knowledge in the form of question-answer (QA) pairs from historical customer-agent conversations to automatically build a knowledge base. |
Md Tahmid Rahman Laskar; Julien Bouvier Tremblay; Xue-Yong Fu; Cheng Chen; Shashi Bhushan TN; | arxiv-cs.CL | 2025-10-09 |
| 130 | Test-Time Reasoners Are Strategic Multiple-Choice Test-Takers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In all, we challenge claims that partial-input success is always a flaw, so we discuss how reasoning traces could separate problematic data from less problematic reasoning. |
Nishant Balepur; Atrey Desai; Rachel Rudinger; | arxiv-cs.CL | 2025-10-09 |
| 131 | Table Question Answering in The Era of Large Language Models: A Comprehensive Survey of Tasks, Methods, and Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, we highlight underexplored but timely topics that have not been systematically covered in prior research. By unifying disparate research threads and identifying open problems, our survey offers a consolidated foundation for the TQA community, enabling a deeper understanding of the state of the art and guiding future developments in this rapidly evolving area. |
Wei Zhou; Bolei Ma; Annemarie Friedrich; Mohsen Mesgar; | arxiv-cs.CL | 2025-10-08 |
| 132 | SUBQRAG: Sub-question Driven Dynamic Graph Rag Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Graph Retrieval-Augmented Generation (Graph RAG) effectively builds a knowledge graph (KG) to connect disparate facts across a large document corpus. However, this broad-view … |
JIAOYANG LI et. al. | arxiv-cs.CL | 2025-10-08 |
| 133 | EverydayMMQA: A Multilingual and Multimodal Framework for Culturally Grounded Spoken Visual QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large-scale multimodal models achieve strong results on tasks like Visual Question Answering (VQA), but they often fail when queries require culturally grounded, everyday knowledge, particularly in low-resource and underrepresented languages. To bridge this gap, we introduce Everyday Multimodal and Multilingual QA (EverydayMMQA), a framework for creating large-scale, culturally-grounded datasets for spoken and visual question answering (SVQA). |
FIROJ ALAM et. al. | arxiv-cs.CL | 2025-10-07 |
| 134 | Asking For It: Question-Answering for Predicting Rule Infractions in Online Content Moderation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Online communities rely on a mix of platform policies and community-authored rules to define acceptable behavior and maintain order. However, these rules vary widely across … |
Mattia Samory; Diana Pamfile; Andrew To; Shruti Phadke; | arxiv-cs.CY | 2025-10-07 |
| 135 | Multi-Hop Question Answering: When Can Humans Help, and Where Do They Struggle? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To better understand how humans might collaborate effectively with AI, we evaluate the performance of crowd workers on these individual reasoning subtasks. We find that while humans excel at knowledge integration (97% accuracy), they often fail to recognize when a question requires multi-hop reasoning (67% accuracy). |
Jinyan Su; Claire Cardie; Jennifer Healey; | arxiv-cs.HC | 2025-10-06 |
| 136 | AgentRouter: A Knowledge-Graph-Guided LLM Router for Collaborative Multi-Agent Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose AgentRouter, a framework that formulates multi-agent QA as a knowledge-graph-guided routing problem supervised by empirical performance signals. |
ZHEYUAN ZHANG et. al. | arxiv-cs.CL | 2025-10-06 |
| 137 | Video-in-the-Loop: Span-Grounded Long Video QA with Interleaved Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Video-in-the-Loop (ViTL), a two-stage long-video QA framework that preserves a fixed token budget by first localizing question-relevant interval(s) with a low-fps skim and then answering via span-aware reallocation of visual tokens at higher effective frame rate, emitting an interleaved output with both spans and the final option for direct attribution. |
CHENDONG WANG et. al. | arxiv-cs.CV | 2025-10-05 |
| 138 | Knowledge Graph-Guided Multi-Agent Distillation for Reliable Industrial Question Answering with Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Industrial question-answering (QA) systems require higher safety and reliability than general-purpose dialogue models, as errors in high-risk scenarios such as equipment fault diagnosis can have severe consequences. Although multi-agent large language models enhance reasoning depth, they suffer from uncontrolled iterations and unverifiable outputs, and conventional distillation methods struggle to transfer collaborative reasoning capabilities to lightweight, deployable student models. |
Jiqun Pan; Zhenke Duan; Jiani Tu; Anzhi Cheng; Yanqing Wang; | arxiv-cs.CL | 2025-10-03 |
| 139 | StepChain GraphRAG: Reasoning Over Knowledge Graphs for Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, challenges persist in integrating iterative reasoning steps with external knowledge retrieval. To address this, we introduce StepChain GraphRAG, a framework that unites question decomposition with a Breadth-First Search (BFS) Reasoning Flow for enhanced multi-hop QA. |
TENGJUN NI et. al. | arxiv-cs.CL | 2025-10-03 |
| 140 | LEAML: Label-Efficient Adaptation to Out-of-Distribution Visual Tasks for Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LEAML, a label-efficient adaptation framework that leverages both scarce labeled VQA samples and abundant unlabeled images. |
Ci-Siang Lin; Min-Hung Chen; Yu-Yang Sheng; Yu-Chiang Frank Wang; | arxiv-cs.CV | 2025-10-03 |
| 141 | AccurateRAG: A Framework for Building Accurate Retrieval-Augmented Question-Answering Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce AccurateRAG, a novel framework for constructing high-performance question-answering applications based on retrieval-augmented generation (RAG). |
LINH THE NGUYEN et. al. | arxiv-cs.CL | 2025-10-02 |
| 142 | Uncertainty As Feature Gaps: Epistemic Uncertainty Quantification of LLMs in Contextual Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on UQ for the contextual QA task and propose a theoretically grounded approach to quantify epistemic uncertainty. |
YAVUZ BAKMAN et. al. | arxiv-cs.CL | 2025-10-02 |
| 143 | One More Question Is Enough, Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Expert Question Decomposition (EQD), an approach designed to balance the use of domain knowledge with computational efficiency. |
Mengyu Wang; Sotirios Sabanis; Miguel de Carvalho; Shay B. Cohen; Tiejun Ma; | arxiv-cs.CL | 2025-10-01 |
| 144 | POVQA: Preference-Optimized Video Question Answering with Rationales for Data Efficiency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce POVQA, a data-efficient pipeline that compresses each second of video into a single temporally pooled image (via motion blur and weighted averaging variants) and then aligns LVLMs with lightweight supervision. Concretely, we build 1 fps input sources using Blend Blur with Last Frame, Weighted Average, Exponential and Ramp pooling and fine-tune QWEN-2.5-VL 7B with a supervised two-turn target including reasoning and final answer. |
Ashim Dahal; Ankit Ghimire; Saydul Akbar Murad; Nick Rahimi; | arxiv-cs.CV | 2025-10-01 |
| 145 | TAG-EQA: Text-And-Graph for Event Question Answering Via Structured Prompting Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce TAG-EQA (Text-And-Graph for Event Question Answering), a prompting framework that injects causal event graphs into LLM inputs by converting structured relations into natural-language statements. TAG-EQA spans nine prompting configurations, combining three strategies (zero-shot, few-shot, chain-of-thought) with three input modalities (text-only, graph-only, text+graph), enabling a systematic analysis of when and how structured knowledge aids inference. |
Maithili Kadam; Francis Ferraro; | arxiv-cs.CL | 2025-10-01 |
| 146 | Boosting Process-Correct CoT Reasoning By Modeling Solvability of Multiple-Choice QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study this through multiple-choice question answering (MCQA), which provides a controlled setting with fixed answer options. |
Raphael Schumann; Stefan Riezler; | arxiv-cs.AI | 2025-09-30 |
| 147 | A Multimodal LLM Approach for Visual Question Answering on Multiparametric 3D Brain MRI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce mpLLM, a prompt-conditioned hierarchical mixture-of-experts (MoE) architecture for visual question answering over multi-parametric 3D brain MRI (mpMRI). |
ARVIND MURARI VEPA et. al. | arxiv-cs.CV | 2025-09-30 |
| 148 | RAGferee: Building Contextual Reward Models for Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the lack of publicly available RAG-centric preference datasets and specialised RMs, we introduce RAGferee, a methodology that repurposes question-answering (QA) datasets into preference pairs that prioritise groundedness over stylistic features, enabling the training of contextual RMs better suited to judging RAG responses. |
ANDREI C. COMAN et. al. | arxiv-cs.CL | 2025-09-30 |
| 149 | Saliency Guided Longitudinal Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a saliency-guided encoder-decoder for chest X-ray Diff-VQA that turns post-hoc saliency into actionable supervision. |
Jialin Wu; Xiaofeng Liu; | arxiv-cs.AI | 2025-09-29 |
| 150 | Can VLM Pseudo-Labels Train A Time-Series QA Model That Outperforms The VLM? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Alternatively, with recent advancements in large-scale models, vision-language models (VLMs) have demonstrated the potential to analyze time-series signals in a zero-shot manner. In this paper, we propose a training approach that uses pseudo labels generated by a VLM. Although VLMs can produce incorrect labels, TSQA models can still be effectively trained based on the property that deep neural networks are inherently robust to such noisy labels. |
Takuya Fujimura; Kota Dohi; Natsuo Yamashita; Yohei Kawaguchi; | arxiv-cs.LG | 2025-09-29 |
| 151 | Knowledge Extraction on Semi-Structured Content: Does It Remain Relevant for Question Answering in The Era of LLMs? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The advent of Large Language Models (LLMs) has significantly advanced web-based Question Answering (QA) systems over semi-structured content, raising questions about the continued utility of knowledge extraction for question answering. This paper investigates the value of triple extraction in this new paradigm by extending an existing benchmark with knowledge extraction annotations and evaluating commercial and open-source LLMs of varying sizes. Our results show that web-scale knowledge extraction remains a challenging task for LLMs. |
KAI SUN et. al. | arxiv-cs.CL | 2025-09-29 |
| 152 | Beyond Isolated Facts: Synthesizing Narrative and Grounded Supervision for VideoQA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To move beyond this paradigm, we introduce a framework to synthesize richer supervisory signals. |
JIANXIN LIANG et. al. | arxiv-cs.CV | 2025-09-29 |
| 153 | Do LLMs Understand Romanian Driving Laws? A Study on Multimodal and Fine-Tuned Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper evaluates Large Language Models (LLMs) on Romanian driving-law QA with explanation generation. |
Eduard Barbu; Adrian Marius Dumitran; | arxiv-cs.CL | 2025-09-28 |
| 154 | JGU Mainz’s Submission to The WMT25 Shared Task on LLMs with Limited Resources for Slavic Languages: MT and QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the JGU Mainz submission to the WMT25 Shared Task on LLMs with Limited Resources for Slavic Languages: Machine Translation and Question Answering, focusing on Ukrainian, Upper Sorbian, and Lower Sorbian. |
Hossain Shaikh Saadi; Minh Duc Bui; Mario Sanz-Guerrero; Katharina von der Wense; | arxiv-cs.CL | 2025-09-26 |
| 155 | From Evidence to Trajectory: Abductive Reasoning Path Synthesis for Training Retrieval-Augmented Generation Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose EviPath, an evidence-anchored reasoning path synthesis paradigm for RAG agent development. |
MUZHI LI et. al. | arxiv-cs.CL | 2025-09-26 |
| 156 | Thinking in Many Modes: How Composite Reasoning Elevates Large Language Model Performance with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this, we introduce Composite Reasoning (CR), a novel reasoning approach empowering LLMs to dynamically explore and combine multiple reasoning styles like deductive, inductive, and abductive for more nuanced problem-solving. |
Zishan Ahmad; Saisubramaniam Gopalakrishnan; | arxiv-cs.CL | 2025-09-26 |
| 157 | A Comprehensive Evaluation of Transformer-Based Question Answering Models and RAG-Enhanced Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive evaluation of retrieval strategies for multi-hop question answering within a retrieval-augmented generation framework. |
Zichen Zhang; Kunlong Zhang; Hongwei Ruan; Yiming Luo; | arxiv-cs.CV | 2025-09-26 |
| 158 | KnowMT-Bench: Benchmarking Knowledge-Intensive Long-Form Question Answering in Multi-Turn Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The factual capability and information delivery efficiency of the final-turn answer are then evaluated using a human-validated automated pipeline. |
JUNHAO CHEN et. al. | arxiv-cs.CL | 2025-09-26 |
| 159 | MIRAGE: Multi-hop Reasoning with Ambiguity Evaluation for Illusory Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To establish a robust baseline, we propose CLarifying Ambiguity with a Reasoning and InstructiON (CLARION), a multi-agent framework that significantly outperforms existing approaches on MIRAGE, paving the way for more adaptive and robust reasoning systems. |
JEONGHYUN PARK et. al. | arxiv-cs.CL | 2025-09-26 |
| 160 | Detecting (Un)answerability in Large Language Models with Linear Directions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the problem of (un)answerability detection, focusing on extractive question answering (QA) where the model should determine if a passage contains sufficient information to answer a given question. |
Maor Juliet Lavi; Tova Milo; Mor Geva; | arxiv-cs.CL | 2025-09-26 |
| 161 | Do LLM Agents Know How to Ground, Recover, and Assess? A Benchmark for Epistemic Competence in Information-Seeking Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent work has explored training Large Language Model (LLM) search agents with reinforcement learning (RL) for open-domain question answering (QA). |
Jiaqi Shao; Yuxiang Lin; Munish Prasad Lohani; Yufeng Miao; Bing Luo; | arxiv-cs.AI | 2025-09-26 |
| 162 | Beyond Stars: Bridging The Gap Between Ratings and Review Sentiment with LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an advanced approach to mobile app review analysis aimed at addressing limitations inherent in traditional star-rating systems. |
Najla Zuhir; Amna Mohammad Salim; Parvathy Premkumar; Moshiur Farazi; | arxiv-cs.AI | 2025-09-25 |
| 163 | SCRA-VQA: Summarized Caption-Rerank for Augmented Large Language Models in Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the captions frequently include excessive noise irrelevant to the question, and LLMs generally do not comprehend VQA tasks, limiting their reasoning capabilities. To address this issue, we propose the Summarized Caption-Rerank Augmented VQA (SCRA-VQA), which employs a pre-trained visual language model to convert images into captions. |
YAN ZHANG et. al. | arxiv-cs.CV | 2025-09-25 |
| 164 | LOCA: Logical Chain Augmentation for Scientific Corpus Cleaning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing scientific question-answering (QA) datasets suffer from high error rates, frequently resulting from logical leaps and implicit reasoning within the answers. To address this issue, we introduce LOCA (Logical Chain Augmentation), a novel framework for automatically cleaning scientific corpora, implemented through an augment-and-review loop. |
YOU-LE FANG et. al. | arxiv-cs.CL | 2025-09-24 |
| 165 | Consistency-Aware Parameter-Preserving Knowledge Editing Framework for Multi-Hop Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Parameter-Preserving Knowledge Editing (PPKE) enables updating models with new or corrected information without retraining or parameter adjustment. Recent PPKE approaches based on … |
Lingwen Deng; Yifei Han; Long Zhang; Yue Du; Bin Li; | arxiv-cs.CL | 2025-09-23 |
| 166 | Are Smaller Open-Weight LLMs Closing The Gap to Proprietary Models for Biomedical Question Answering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we compare several open-weight models against top-performing systems such as GPT-4o, GPT-4.1, Claude 3.5 Sonnet, and Claude 3.7 Sonnet. |
Damian Stachura; Joanna Konieczna; Artur Nowak; | arxiv-cs.CL | 2025-09-23 |
| 167 | Pathways of Thoughts: Multi-Directional Thinking for Long-form Personalized Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, personalized QA remains relatively underexplored due to challenges such as inferring preferences from long, noisy, and implicit contexts, and generating responses that are simultaneously correct, contextually appropriate, and aligned with user expectations and background knowledge. To address these challenges, we propose Pathways of Thoughts (PoT), an inference-stage method that applies to any large language model (LLM) without requiring task-specific fine-tuning. |
ALIREZA SALEMI et. al. | arxiv-cs.CL | 2025-09-23 |
| 168 | NeuS-QA: Grounding Long-Form Video Understanding in Temporal Logic and Neuro-Symbolic Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a result, there are no formal guarantees that the sampled context actually encodes the compositional or causal logic demanded by the question. To address these foundational gaps, we introduce NeuS-QA, a training-free, plug-and-play neuro-symbolic pipeline for LVQA. |
SAHIL SHAH et. al. | arxiv-cs.CV | 2025-09-22 |
| 169 | Memory-QA: Answering Recall Questions Based on Multimodal Memories Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This task poses unique challenges, including the creation of task-oriented memories, the effective utilization of temporal and location information within memories, and the ability to draw upon multiple memories to answer a recall question. To address these challenges, we propose a comprehensive pipeline, Pensieve, integrating memory-specific augmentation, time- and location-aware multi-signal retrieval, and multi-memory QA fine-tuning. |
HONGDA JIANG et. al. | arxiv-cs.AI | 2025-09-22 |
| 170 | Semantic Reformulation Entropy for Robust Hallucination Detection in QA Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Semantic Reformulation Entropy (SRE), which improves uncertainty estimation in two ways. |
CHAODONG TONG et. al. | arxiv-cs.CL | 2025-09-22 |
| 171 | Towards Adaptive Context Management for Intelligent Conversational Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an Adaptive Context Management (ACM) framework for Conversational Question Answering (ConvQA) systems. |
Manoj Madushanka Perera; Adnan Mahmood; Kasun Eranda Wijethilake; Quan Z. Sheng; | arxiv-cs.CL | 2025-09-22 |
| 172 | LLaVul: A Multimodal LLM for Interpretable Vulnerability Reasoning About Source Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our model is trained to integrate paired code and natural queries into a unified space, enhancing reasoning and context-dependent insights about code vulnerability. |
Ala Jararweh; Michael Adams; Avinash Sahu; Abdullah Mueen; Afsah Anwar; | arxiv-cs.AI | 2025-09-21 |
| 173 | AirQA: A Comprehensive QA Dataset for AI Research with Instance-Level Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose AirQA, a human-annotated comprehensive paper QA dataset in the field of artificial intelligence (AI), with 13,948 papers and 1,246 questions, that encompasses multi-task, multi-modal and instance-level evaluation. |
TIANCHENG HUANG et. al. | arxiv-cs.CL | 2025-09-21 |
| 174 | Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent speech-LLMs have shown impressive performance in tasks like transcription and translation, yet they remain limited in understanding the paralinguistic aspects of speech … |
QIONGQIONG WANG et. al. | arxiv-cs.CL | 2025-09-20 |
| 175 | Comparing RAG and GraphRAG for Page-Level Retrieval Question Answering on Math Textbook Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Overall, this study highlights both the promises and challenges of page-level retrieval systems in educational contexts, emphasizing the need for more refined retrieval methods to build reliable AI tutoring solutions in providing reference page numbers. |
EASON CHEN et. al. | arxiv-cs.IR | 2025-09-20 |
| 176 | Question Answering with LLMs and Learning from Answer Sets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce LLM2LAS, a hybrid system that effectively combines the natural language understanding capabilities of LLMs, the rule induction power of the Learning from Answer Sets (LAS) system ILASP, and the formal reasoning strengths of Answer Set Programming (ASP). |
MANUEL BORROTO et. al. | arxiv-cs.AI | 2025-09-20 |
| 177 | Time to Revisit Exact Match Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce TempAnswerQA, a benchmark distilled from Test of Time and TempTabQA, where all questions require a numerical, temporal answer, allowing us to evaluate models beyond EM. We use the forecasting metrics symmetric mean absolute percentage error (sMAPE) and mean absolute scaled error (MASE). |
Auss Abbood; Zaiqiao Meng; Nigel Collier; | arxiv-cs.CL | 2025-09-20 |
| 178 | Jamendo-QA: A Large-Scale Music Question Answering Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Jamendo-QA, a large-scale dataset for Music Question Answering (Music-QA). |
Junyoung Koh; Soo Yong Kim; Yongwon Choi; Gyu Hyeong Choi; | arxiv-cs.MM | 2025-09-19 |
| 179 | SWE-QA: Can Language Models Answer Repository-level Code Questions? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present SWE-QA, a repository-level code question answering (QA) benchmark designed to facilitate research on automated QA systems in realistic code environments. |
WEIHAN PENG et. al. | arxiv-cs.CL | 2025-09-18 |
| 180 | Quantifying Uncertainty in Natural Language Explanations of Large Language Models for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Notably, generating valid uncertainty estimates for natural language explanations is particularly challenging due to the auto-regressive generation process of LLMs and the presence of noise in medical inquiries. To bridge this gap, in this work, we first propose a novel uncertainty estimation framework for these generated natural language explanations, which provides valid uncertainty guarantees in a post-hoc and model-agnostic manner. |
Yangyi Li; Mengdi Huai; | arxiv-cs.CL | 2025-09-18 |
| 181 | HistoryBankQA: Multilingual Temporal Question Answering on Historical Events Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing temporal reasoning datasets are limited in scale, lack multilingual coverage and focus more on contemporary events. To address these limitations, we present HistoryBank, a multilingual database of 10M+ historical events extracted from Wikipedia timeline pages and article infoboxes. |
Biswadip Mandal; Anant Khandelwal; Manish Gupta; | arxiv-cs.CL | 2025-09-16 |
| 182 | AQUA-LLM: Evaluating Accuracy, Quantization, and Adversarial Robustness Trade-offs in LLMs for Cybersecurity Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose AQUA-LLM, an evaluation framework designed to benchmark several state-of-the-art small LLMs under four distinct configurations: base, quantized-only, fine-tuned, and fine-tuned combined with quantization, specifically for cybersecurity QA. |
Onat Gungor; Roshan Sood; Harold Wang; Tajana Rosing; | arxiv-cs.CR | 2025-09-16 |
| 183 | Graph-Enhanced Retrieval-Augmented Question Answering for E-Commerce Customer Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper develops a novel retrieval-augmented generation (RAG) framework that uses knowledge graphs (KGs) to improve the relevance of the answer and the factual grounding. |
Piyushkumar Patel; | arxiv-cs.CL | 2025-09-15 |
| 184 | Bridging Vision Language Models and Symbolic Grounding for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study symbolic scene graphs (SGs) as intermediate grounding signals for VQA. |
Haodi Ma; Vyom Pathak; Daisy Zhe Wang; | arxiv-cs.CV | 2025-09-15 |
| 185 | ParaEQsA: Parallel and Asynchronous Embodied Questions Scheduling and Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper formulates the Embodied Questions Answering (EQsA) problem, introduces a corresponding benchmark, and proposes a system to tackle the problem. |
Haisheng Wang; Weiming Zhi; | arxiv-cs.RO | 2025-09-15 |
| 186 | FineQuest: Adaptive Knowledge-Assisted Sports Video Understanding Via Agent-of-Thoughts Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose FineQuest, the first training-free framework that leverages dual-mode reasoning inspired by cognitive science: i) Reactive Reasoning for straightforward sports queries and ii) Deliberative Reasoning for more complex ones. |
Haodong Chen; Haojian Huang; XinXiang Yin; Dian Shao; | arxiv-cs.CV | 2025-09-15 |
| 187 | AgenticIE: An Adaptive Agent for Information Extraction from Complex Regulatory Documents Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Declaration of Performance (DoP) documents, mandated by EU regulation, certify the performance of construction products. There are two challenges to make DoPs machine and human … |
Gaye Colakoglu; Gürkan Solmaz; Jonathan Fürst; | arxiv-cs.CL | 2025-09-15 |
| 188 | MORQA: Benchmarking Evaluation Metrics for Medical Open-Ended Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce MORQA (Medical Open-Response QA), a new multilingual benchmark designed to assess the effectiveness of NLG evaluation metrics across three medical visual and text-based QA datasets in English and Chinese. |
WEN-WAI YIM et. al. | arxiv-cs.CL | 2025-09-15 |
| 189 | Improving LLMs’ Learning for Coreference Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the limitations of existing LLM-based approaches to CR, specifically the Question-Answering (QA) Template and Document Template methods, and propose two novel techniques: Reversed Training with Joint Inference and Iterative Document Generation. |
Yujian Gan; Yuan Liang; Yanni Lin; Juntao Yu; Massimo Poesio; | arxiv-cs.CL | 2025-09-14 |
| 190 | !MSA at AraHealthQA 2025 Shared Task: Enhancing LLM Performance for Arabic Clinical Question Answering Through Prompt Engineering and Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our systems for Track 2 (General Arabic Health QA, MedArabiQ) of the AraHealthQA-2025 shared task, where our methodology secured 2nd place in both Sub-Task 1 (multiple-choice question answering) and Sub-Task 2 (open-ended question answering) in Arabic clinical contexts. |
Mohamed Tarek; Seif Ahmed; Mohamed Basem; | arxiv-cs.CL | 2025-09-14 |
| 191 | Evaluating Large Language Models for Evidence-Based Clinical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have demonstrated substantial progress in biomedical and clinical applications, motivating rigorous evaluation of their ability to answer nuanced, evidence-based questions. |
Can Wang; Yiqun Chen; | arxiv-cs.CL | 2025-09-13 |
| 192 | Constructing A Question-Answering Simulator Through The Distillation of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a method named LLM Distillation based Simulator (LDSim), which distills domain knowledge and reasoning capability from an LLM to better assist prediction, thereby improving simulation performance. |
Haipeng Liu; Ting Long; Jing Fu; | arxiv-cs.LG | 2025-09-11 |
| 193 | Agentic LLMs for Question Answering Over Tabular Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a Natural Language to SQL (NL-to-SQL) approach leveraging large language models (LLMs) such as GPT-4o, GPT-4o-mini, and DeepSeek v2:16b to generate SQL queries dynamically. |
Rishit Tyagi; Mohit Gupta; Rahul Bouri; | arxiv-cs.CL | 2025-09-11 |
| 194 | A Knowledge Noise Mitigation Framework for Knowledge-based Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Knowledge-based visual question answering (KB-VQA) requires a model to understand images and utilize external knowledge to provide accurate answers. Existing approaches often directly augment models with retrieved information from knowledge sources while ignoring substantial knowledge redundancy, which introduces noise into the answering process. To address this, we propose a training-free framework with knowledge focusing for KB-VQA that mitigates the impact of noise by enhancing knowledge relevance and reducing redundancy. First, for knowledge retrieval, our framework concludes essential parts from the image-question pairs, creating low-noise queries that enhance the retrieval of highly relevant knowledge. |
Zhiyue Liu; Sihang Liu; Jinyuan Liu; Xinru Zhang; | arxiv-cs.CV | 2025-09-11 |
| 195 | Retrieval-Augmented Generation for Reliable Interpretation of Radio Regulations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study question answering in the domain of radio regulations, a legally sensitive and high-stakes area. We propose a telecom-specific Retrieval-Augmented Generation (RAG) pipeline and introduce, to our knowledge, the first multiple-choice evaluation set for this domain, constructed from authoritative sources using automated filtering and human validation. |
Zakaria El Kassimi; Fares Fourati; Mohamed-Slim Alouini; | arxiv-cs.IR | 2025-09-11 |
| 196 | Fusing Knowledge and Language: A Comparative Study of Knowledge Graph-Based Question Answering with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While traditional Retrieval Augmented Generation (RAG) approaches are proficient in fact-based and local context-based extraction from concise texts, they encounter limitations when addressing the thematic and holistic understanding of complex, extensive texts, requiring a deeper analysis of both text and context. This paper presents a comprehensive technical comparative study of three different methodologies for constructing knowledge graph triplets and integrating them with Large Language Models (LLMs) for question answering: spaCy, Stanford CoreNLP-OpenIE, and GraphRAG, all leveraging open source technologies. |
Vaibhav Chaudhary; Neha Soni; Narotam Singh; Amita Kapoor; | arxiv-cs.AI | 2025-09-11 |
| 197 | LLM Ensemble for RAG: Role of Context Length in Zero-Shot Question Answering for BioASQ Challenge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore how large language models (LLMs) can be used for information retrieval (IR), and how an ensemble of zero-shot models can accomplish state-of-the-art performance on a domain-specific Yes/No QA task. |
Dima Galat; Diego Molla-Aliod; | arxiv-cs.CL | 2025-09-10 |
| 198 | A Role-Aware Multi-Agent Framework for Financial Education Question Answering with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluated our framework on a set of 3,532 expert-designed finance education questions from Study.com, an online learning platform. |
Andy Zhu; Yingjun Du; | arxiv-cs.CL | 2025-09-10 |
| 199 | TextlessRAG: End-to-End Visual Document RAG By Speech Without Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose TextlessRAG, the first end-to-end framework for speech-based question answering over large-scale document images. |
PEIJIN XIE et. al. | arxiv-cs.CV | 2025-09-09 |
| 200 | Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this issue, we propose a novel Iterative Retrieval-Augmented Knowledge Editing method with guided decomposition (IRAKE) through the guidance from single edited facts and entire edited cases. |
Yi Liu; Xiangrong Zhu; Xiangyu Liu; Wei Wei; Wei Hu; | arxiv-cs.CL | 2025-09-09 |
| 201 | The Role of Exploration Modules in Small Language Models for Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the capabilities of existing integration methods for small language models (SLMs) in KG-based question answering and observe that their performance is often constrained by their limited ability to traverse and reason over knowledge graphs. To address this limitation, we propose leveraging simple and efficient exploration modules to handle knowledge graph traversal in place of the language model itself. |
Yi-Jie Cheng; Oscar Chew; Yun-Nung Chen; | arxiv-cs.CL | 2025-09-09 |
| 202 | HANRAG: Heuristic Accurate Noise-resistant Retrieval-Augmented Generation for Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The results demonstrate that our framework obtains superior performance in both single-hop and multi-hop question-answering tasks. |
DUOLIN SUN et. al. | arxiv-cs.CL | 2025-09-08 |
| 203 | Spatial Reasoning with Vision-Language Models in Ego-Centric Multi-View Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our results reveal a notable performance gap between human level scores and VLM performance, highlighting that current VLMs still fall short of human level spatial understanding. To bridge this gap, we propose Ego3D-VLM, a post-training framework that enhances 3D spatial reasoning of VLMs. |
MOHSEN GHOLAMI et. al. | arxiv-cs.CV | 2025-09-07 |
| 204 | A Survey of The State-of-the-Art in Conversational Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Overall, this work offers a comprehensive overview of the ConvQA landscape and provides valuable insights to guide future advancements in the field. |
MANOJ MADUSHANKA PERERA et. al. | arxiv-cs.CL | 2025-09-06 |
| 205 | KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present KERAG, a novel KG-based RAG pipeline that enhances QA coverage by retrieving a broader subgraph likely to contain relevant information. |
YUSHI SUN et. al. | arxiv-cs.CL | 2025-09-04 |
| 206 | Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: When LLMs memorize outdated medical knowledge, they can provide harmful advice or fail at clinical reasoning tasks. To investigate this problem, we introduce two novel question-answering (QA) datasets derived from systematic reviews: MedRevQA (16,501 QA pairs covering general biomedical knowledge) and MedChangeQA (a subset of 512 QA pairs where medical consensus has changed over time). |
Juraj Vladika; Mahdi Dhaini; Florian Matthes; | arxiv-cs.CL | 2025-09-04 |
| 207 | CMRAG: Co-modality-based Visual Document Retrieval and Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods have limitations when dealing with multimodal documents: one category of methods relies on layout analysis and text extraction, which can only utilize explicit text information and struggle to capture images or unstructured content; the other category treats document segmentation as visual input and directly passes it to visual language models (VLMs) for processing, yet it ignores the semantic advantages of text, leading to suboptimal retrieval and generation results. To address these research gaps, we propose the Co-Modality-based RAG (CMRAG) framework, which can simultaneously leverage texts and images for more accurate retrieval and generation. |
WANG CHEN et. al. | arxiv-cs.CL | 2025-09-02 |
| 208 | Understanding Question-answering Systems: Evolution, Applications, Trends, and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View |
Amer Farea; Frank Emmert-Streib; | Eng. Appl. Artif. Intell. | 2025-09-01 |
| 209 | Bio-inspired Product Design System Integrating Retrieval-augmented Question Answering and Semantic Fusion Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xinhui Kang; Wenjie You; Ying Luo; | Adv. Eng. Informatics | 2025-09-01 |
| 210 | Decomposing and Revising What Language Models Generate Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the generated questions are often irrelevant and incomplete, resulting in a loss of facts in retrieval. These approaches also fail to aggregate evidence snippets from different documents and paragraphs. To tackle these problems, we propose a new fact decomposition-based framework called FIDES (faithful context enhanced fact decomposition and evidence aggregation) for attributed QA. |
ZHICHAO YAN et. al. | arxiv-cs.CL | 2025-08-31 |
| 211 | CaresAI at BioCreative IX Track 1 — LLM for Biomedical QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a two-stage inference pipeline for precise short-answer extraction to mitigate verbosity and improve alignment with evaluation metrics. Despite partial improvements, challenges persist in generating strictly formatted outputs. |
Reem Abdel-Salam; Mary Adewunmi; Modinat A. Abayomi; | arxiv-cs.CL | 2025-08-31 |
| 212 | Geospatial Question Answering on Historical Maps Using Spatio-Temporal Knowledge Graphs and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this project, we developed a GeoQA system by integrating a spatio-temporal knowledge graph (KG) constructed from historical map data with large language models (LLMs). |
Ziyi Liu; Sidi Wu; Lorenz Hurni; | arxiv-cs.IR | 2025-08-29 |
| 213 | Benchmarking GPT-5 for Biomedical Natural Language Processing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rapid expansion of biomedical literature has heightened the need for scalable natural language processing (NLP) solutions. While GPT-4 substantially narrowed the gap with … |
Yu Hou; Zaifu Zhan; Rui Zhang; | ArXiv | 2025-08-28 |
| 214 | Overview of BioASQ 2025: The Thirteenth BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This is an overview of the thirteenth edition of the BioASQ challenge in the context of the Conference and Labs of the Evaluation Forum (CLEF) 2025. BioASQ is a series of … |
ANASTASIOS NENTIDIS et. al. | arxiv-cs.CL | 2025-08-28 |
| 215 | ChainReaction! Structured Approach with Causal Chains As Intermediate Representations for Improved and Explainable Causal Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel, modular framework that explicitly decouples causal reasoning from answer generation, introducing natural language causal chains as interpretable intermediate representations. |
Paritosh Parmar; Eric Peh; Basura Fernando; | arxiv-cs.CV | 2025-08-28 |
| 216 | Overview of BioASQ 2024: The Twelfth BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This is an overview of the twelfth edition of the BioASQ challenge in the context of the Conference and Labs of the Evaluation Forum (CLEF) 2024. BioASQ is a series of international … |
ANASTASIOS NENTIDIS et. al. | arxiv-cs.CL | 2025-08-28 |
| 217 | AI-SearchPlanner: Modular Agentic Search Via Pareto-Optimal Multi-Objective Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose AI-SearchPlanner, a novel reinforcement learning framework designed to enhance the performance of frozen QA models by focusing on search planning. Specifically, our approach introduces three key innovations: 1) Decoupling the Architecture of the Search Planner and Generator, 2) Dual-Reward Alignment for Search Planning, and 3) Pareto Optimization of Planning Utility and Cost, to achieve the objectives. |
Lang Mei; Zhihan Yang; Chong Chen; | arxiv-cs.AI | 2025-08-27 |
| 218 | AraHealthQA 2025: The First Shared Task on Arabic Health Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce AraHealthQA 2025, the Comprehensive Arabic Health Question Answering Shared Task, held in conjunction with ArabicNLP 2025 (co-located with EMNLP 2025). |
HASSAN ALHUZALI et. al. | arxiv-cs.CL | 2025-08-27 |
| 219 | Extracting Information from Scientific Literature Via Visual Table Question Answering Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores three approaches to processing table data in scientific papers to enhance extractive question answering and develop a software tool for the systematic review process. |
Dongyoun Kim; Hyung-do Choi; Youngsun Jang; John Kim; | arxiv-cs.IR | 2025-08-26 |
| 220 | Knowing or Guessing? Robust Medical Visual Question Answering Via Joint Consistency and Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: When evaluating state-of-the-art (SOTA) models like LLaVA-Med on RoMed, we observe alarming performance drops (e.g., a 40% decline in Recall) compared to original VQA benchmarks, exposing critical robustness gaps. |
SONGTAO JIANG et. al. | arxiv-cs.CL | 2025-08-26 |
| 221 | MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To compile MQAD, our methodology leverages specialized Music Information Retrieval (MIR) models to extract higher-level musical features and Large Language Models (LLMs) to generate natural language QA pairs. |
ZHIHAO OUYANG et. al. | arxiv-cs.SD | 2025-08-26 |
| 222 | Chronological Passage Assembling in RAG Framework for Temporal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, understanding narrative texts requires more than isolated segments, as the broader context and sequential relationships between segments are crucial for comprehension. To address these limitations, we propose ChronoRAG, a novel RAG framework specialized for narrative texts. This approach focuses on two essential aspects: refining dispersed document information into coherent and structured passages and preserving narrative flow by explicitly capturing and maintaining the temporal order among retrieved passages. |
Byeongjeong Kim; Jeonghyun Park; Joonho Yang; Hwanhee Lee; | arxiv-cs.CL | 2025-08-26 |
| 223 | Can Out-of-Distribution Evaluations Uncover Reliance on Shortcuts? A Case Study in Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite their practicality, such evaluations build upon a strong assumption: that OOD evaluations can capture and reflect upon possible failures in a real-world deployment. In this work, we challenge this assumption and confront the results obtained from OOD evaluations with a set of specific failure modes documented in existing question-answering (QA) models, referred to as a reliance on spurious features or prediction shortcuts. |
Michal Štefánik; Timothee Mickus; Marek Kadlčík; Michal Spiegel; Josef Kuchař; | arxiv-cs.CL | 2025-08-25 |
| 224 | ST-Raptor: LLM-Powered Semi-Structured Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Second, methods like NL2Code and multi-modal LLM QA struggle to understand the complex layouts of semi-structured tables and cannot accurately answer corresponding questions. To this end, we propose ST-Raptor, a tree-based framework for semi-structured table question answering using large language models. |
ZIRUI TANG et. al. | arxiv-cs.AI | 2025-08-25 |
| 225 | AVAM: Universal Training-free Adaptive Visual Anchoring Embedded Into Multimodal Large Language Model for Multi-image Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a straightforward yet universal Adaptive Visual Anchoring strategy, which can be seamlessly integrated into existing MLLMs, offering significant accuracy improvements through adaptive compression. |
Kang Zeng; Guojin Zhong; Jintao Cheng; Jin Yuan; Zhiyong Li; | arxiv-cs.CV | 2025-08-25 |
| 226 | Agri-Query: A Case Study on RAG Vs. Long-Context LLMs for Cross-Lingual Technical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a case study evaluating large language models (LLMs) with 128K-token context windows on a technical question answering (QA) task. |
Julius Gun; Timo Oksanen; | arxiv-cs.CL | 2025-08-25 |
| 227 | Omne-R1: Learning to Reason with Memory for Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Omne-R1, a novel approach designed to enhance multi-hop question answering capabilities on schema-free knowledge graphs by integrating advanced reasoning models. |
BOYUAN LIU et. al. | arxiv-cs.CL | 2025-08-24 |
| 228 | PlantVillageVQA: A Visual Question Answering Dataset for Benchmarking Vision-Language Models in Plant Science Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective remains to provide a publicly available, standardised and expert-verified database to enhance diagnostic accuracy for plant disease identification and advance scientific research in the agricultural domain. |
Syed Nazmus Sakib; Nafiul Haque; Mohammad Zabed Hossain; Shifat E. Arman; | arxiv-cs.CV | 2025-08-23 |
| 229 | PediatricsMQA: A Multi-modal Pediatrics Question Answering Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Evaluating state-of-the-art open models, we find dramatic performance drops in younger cohorts, highlighting the need for age-aware methods to ensure equitable AI support in pediatric care. |
Adil Bahaj; Oumaima Fadi; Mohamed Chetouani; Mounir Ghogho; | arxiv-cs.CY | 2025-08-22 |
| 230 | MizanQA: Benchmarking Large Language Models on Moroccan Legal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MizanQA (pronounced Mizan, meaning scale in Arabic, a universal symbol of justice), a benchmark designed to evaluate LLMs on Moroccan legal question answering (QA) tasks, characterised by rich linguistic and legal complexity. |
Adil Bahaj; Mounir Ghogho; | arxiv-cs.CL | 2025-08-22 |
| 231 | Hierarchical Vision-Language Reasoning for Multimodal Multiple-Choice Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multimodal Large Language Models (MLLMs) have demonstrated remarkable multimodal understanding capabilities in Visual Question Answering (VQA) tasks by integrating visual and textual features. |
AO ZHOU et. al. | arxiv-cs.IR | 2025-08-22 |
| 232 | XLQA: A Benchmark for Locale-Aware Multilingual Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these limitations, we introduce XLQA, a novel benchmark explicitly designed for locale-sensitive multilingual ODQA. |
Keon-Woo Roh; Yeong-Joon Ju; Seong-Whan Lee; | arxiv-cs.CL | 2025-08-22 |
| 233 | MedQARo: A Large-Scale Benchmark for Medical Question Answering in Romanian Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Question answering (QA) is an actively studied topic, being a core natural language processing (NLP) task that needs to be addressed before achieving Artificial General Intelligence … |
ANA-CRISTINA ROGOZ et. al. | arxiv-cs.CL | 2025-08-22 |
| 234 | M3TQA: Massively Multilingual Multitask Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing multilingual table benchmarks suffer from geolinguistic imbalance – overrepresenting certain languages and lacking sufficient scale for rigorous cross-lingual analysis. To address these limitations, we introduce a comprehensive framework for massively multilingual multitask table question answering, featuring m3TQA-Instruct, a large-scale benchmark spanning 97 languages across diverse language families, including underrepresented and low-resource languages. |
DAIXIN SHU et. al. | arxiv-cs.CL | 2025-08-22 |
| 235 | Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We further argue that knowledge acquisition should be progressive, mirroring human learning: first understanding concepts, then applying them to complex reasoning. To address this, we propose Selct2Know (S2K), a cost-effective framework that internalizes domain knowledge through an internal-external knowledge self-selection strategy and selective supervised fine-tuning. |
BOLEI HE et. al. | arxiv-cs.CL | 2025-08-20 |
| 236 | MedCoT-RAG: Causal Chain-of-Thought RAG for Medical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce MedCoT-RAG, a domain-specific framework that combines causal-aware document retrieval with structured chain-of-thought prompting tailored to medical workflows. |
Ziyu Wang; Elahe Khatibi; Amir M. Rahmani; | arxiv-cs.CL | 2025-08-20 |
| 237 | Towards LLM-generated Explanations for Component-based Knowledge Graph Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the explainability of component-based systems for Question Answering (QA). |
Dennis Schiese; Aleksandr Perevalov; Andreas Both; | arxiv-cs.SE | 2025-08-20 |
| 238 | AdaDocVQA: Adaptive Framework for Long Document Visual Question Answering in Low-Resource Settings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ablation studies confirm meaningful contributions from each component, and our framework establishes new state-of-the-art results for Japanese document VQA while providing a scalable foundation for other low-resource languages and specialized domains. |
Haoxuan Li; Wei Song; Aofan Liu; Peiwu Qin; | arxiv-cs.CL | 2025-08-19 |
| 239 | Interactive Query Answering on Knowledge Graphs with Soft Entity Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In practice, many real-world queries involve constraints that are inherently vague or context-dependent, such as preferences for attributes or related categories. Addressing this gap, we introduce the problem of query answering with soft constraints. |
DANIEL DAZA et. al. | arxiv-cs.AI | 2025-08-19 |
| 240 | Mitigating Easy Option Bias in Multiple-Choice Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this early study, we observe an Easy-Options Bias (EOB) issue in some multiple-choice Visual Question Answering (VQA) benchmarks such as MMStar, RealWorldQA, SEED-Bench, Next-QA, STAR benchmark and Video-MME. |
Hao Zhang; Chen Li; Basura Fernando; | arxiv-cs.CV | 2025-08-18 |
| 241 | Consensus or Conflict? Fine-Grained Evaluation of Conflicting Answers in Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have demonstrated strong performance in question answering (QA) tasks. |
Eviatar Nachshoni; Arie Cattan; Shmuel Amar; Ori Shapira; Ido Dagan; | arxiv-cs.CL | 2025-08-17 |
| 242 | Adversarial Attacks on VQA-NLE: Exposing and Alleviating Inconsistencies in Visual Question Answering Explanations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Natural language explanations in visual question answering (VQA-NLE) aim to make black-box models more transparent by elucidating their decision-making processes. |
Yahsin Yeh; Yilun Wu; Bokai Ruan; Honghan Shuai; | arxiv-cs.CV | 2025-08-17 |
| 243 | Cross-Granularity Hypergraph Retrieval-Augmented Generation for Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel RAG approach called HGRAG for MHQA that achieves cross-granularity integration of structural and semantic information via hypergraphs. |
Changjian Wang; Weihong Deng; Weili Guan; Quan Lu; Ning Jiang; | arxiv-cs.CL | 2025-08-15 |
| 244 | MobQA: A Benchmark Dataset for Semantic Understanding of Human Mobility Data Through Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents MobQA, a benchmark dataset designed to evaluate the semantic understanding capabilities of large language models (LLMs) for human mobility data through natural language question answering. |
Hikaru Asano; Hiroki Ouchi; Akira Kasuga; Ryo Yonetani; | arxiv-cs.CL | 2025-08-14 |
| 245 | Learning from Natural Language Feedback for Personalized Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce VAC, a novel framework for personalized response generation that replaces scalar rewards with natural language feedback (NLF) that is generated conditioned on the user profiles and the question narratives. |
Alireza Salemi; Hamed Zamani; | arxiv-cs.CL | 2025-08-14 |
| 246 | Medico 2025: Visual Question Answering for Gastrointestinal Imaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Medico 2025 challenge addresses Visual Question Answering (VQA) for Gastrointestinal (GI) imaging, organized as part of the MediaEval task series. The challenge focuses on developing Explainable Artificial Intelligence (XAI) models that answer clinically relevant questions based on GI endoscopy images while providing interpretable justifications aligned with medical reasoning. |
Sushant Gautam; Vajira Thambawita; Michael Riegler; Pål Halvorsen; Steven Hicks; | arxiv-cs.CV | 2025-08-14 |
| 247 | STRIDE-QA: Visual Question Answering Dataset for Spatiotemporal Reasoning in Urban Driving Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vision-Language Models (VLMs) have been applied to autonomous driving to support decision-making in complex real-world scenarios. |
Keishi Ishihara; Kento Sasaki; Tsubasa Takahashi; Daiki Shiono; Yu Yamaguchi; | arxiv-cs.CV | 2025-08-14 |
| 248 | EgoCross: Benchmarking Multimodal Large Language Models for Cross-Domain Egocentric Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We hope EgoCross and our accompanying analysis will serve as a foundation for advancing domain-adaptive, robust egocentric video understanding. |
YANJUN LI et. al. | arxiv-cs.CV | 2025-08-14 |
| 249 | BERT-VQA: Visual Question Answering on Plots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Visual question answering has been an exciting challenge in the field of natural language understanding, as it requires deep learning models to exchange information from both vision and language domains. In this project, we aim to tackle a subtask of this problem, namely visual question answering on plots. |
Tai Vu; Robert Yang; | arxiv-cs.LG | 2025-08-13 |
| 250 | Transforming Questions and Documents for Semantically Aligned Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel retrieval-augmented generation (RAG) framework tailored for multihop question answering. |
Seokgi Lee; | arxiv-cs.CL | 2025-08-13 |
| 251 | RAGulating Compliance: A Multi-Agent Knowledge Graph for Regulatory QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Regulatory compliance question answering (QA) requires precise, verifiable information and domain-specific expertise, posing challenges for Large Language Models (LLMs). In this work, we present a novel multi-agent framework that integrates a Knowledge Graph (KG) of Regulatory triplets with Retrieval-Augmented Generation (RAG) to address these demands. |
Bhavik Agarwal; Hemant Sunil Jomraj; Simone Kaplunov; Jack Krolick; Viktoria Rojkova; | arxiv-cs.AI | 2025-08-13 |
| 252 | LyS at SemEval 2025 Task 8: Zero-Shot Code Generation for Tabular QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes our participation in SemEval 2025 Task 8, focused on Tabular Question Answering. |
Adrián Gude; Roi Santos-Ríos; Francisco Prado-Valiño; Ana Ezquerro; Jesús Vilares; | arxiv-cs.CL | 2025-08-12 |
| 253 | RSVLM-QA: A Benchmark Dataset for Remote Sensing Vision Language Model-based Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces the RSVLM-QA dataset, a new large-scale, content-rich VQA dataset for the RS domain. RSVLM-QA is constructed by integrating data from several prominent RS segmentation and detection datasets: WHU, LoveDA, INRIA, and iSAID. |
XING ZI et. al. | arxiv-cs.CV | 2025-08-11 |
| 254 | Capabilities of GPT-5 on Multimodal Medical Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A representative case study demonstrates GPT-5’s ability to integrate visual and textual cues into a coherent diagnostic reasoning chain, recommending appropriate high-stakes interventions. |
Shansong Wang; Mingzhe Hu; Qiang Li; Mojtaba Safari; Xiaofeng Yang; | arxiv-cs.CL | 2025-08-11 |
| 255 | VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing benchmarks focus on English-only document retrieval or only consider multilingual question-answering on a single-page image. To bridge this gap, we introduce VisR-Bench, a multilingual benchmark designed for question-driven multimodal retrieval in long documents. Our benchmark comprises over 35K high-quality QA pairs across 1.2K documents, enabling fine-grained evaluation of multimodal retrieval. |
JIAN CHEN et. al. | arxiv-cs.CV | 2025-08-10 |
| 256 | HealthBranches: Synthesizing Clinically-Grounded Question Answering Datasets Via Decision Pathways Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: HealthBranches is a novel benchmark dataset for medical Question-Answering (Q&A), specifically designed to evaluate complex reasoning in Large Language Models (LLMs). This dataset … |
CRISTIAN COSENTINO et. al. | arxiv-cs.CL | 2025-08-10 |
| 257 | ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, no known study tests the LLMs’ robustness when presented with obfuscated versions of questions. To systematically evaluate these limitations, we propose a novel technique, ObfusQAte, and, leveraging the same, introduce ObfusQA, a comprehensive, first-of-its-kind framework with multi-tiered obfuscation levels designed to examine LLM capabilities across three distinct dimensions: (i) Named-Entity Indirection, (ii) Distractor Indirection, and (iii) Contextual Overload. |
Shubhra Ghosh; Abhilekh Borah; Aditya Kumar Guru; Kripabandhu Ghosh; | arxiv-cs.CL | 2025-08-10 |
| 258 | BharatBBQ: A Multilingual Bias Benchmark for Question Answering in The Indian Context Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Evaluating social biases in language models (LMs) is crucial for ensuring fairness and minimizing the reinforcement of harmful stereotypes in AI systems. Existing benchmarks, such as the Bias Benchmark for Question Answering (BBQ), primarily focus on Western contexts, limiting their applicability to the Indian context. To address this gap, we introduce BharatBBQ, a culturally adapted benchmark designed to assess biases in Hindi, English, Marathi, Bengali, Tamil, Telugu, Odia, and Assamese. |
Aditya Tomar; Nihar Ranjan Sahoo; Pushpak Bhattacharyya; | arxiv-cs.CL | 2025-08-09 |
| 259 | Two-Stage Quranic QA Via Ensemble Retrieval and Instruction-Tuned Answer Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel two-stage framework that addresses both passage retrieval and answer extraction. |
Mohamed Basem; Islam Oshallah; Ali Hamdi; Khaled Shaban; Hozaifa Kassab; | arxiv-cs.CL | 2025-08-09 |
| 260 | SafePLUG: Empowering Multimodal LLMs with Pixel-Level Insight and Temporal Grounding for Traffic Accident Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing MLLMs in this domain primarily focus on coarse-grained image-level or video-level comprehension and often struggle to handle fine-grained visual details or localized scene components, limiting their applicability in complex accident scenarios. To address these limitations, we propose SafePLUG, a novel framework that empowers MLLMs with both Pixel-Level Understanding and temporal Grounding for comprehensive traffic accident analysis. |
ZIHAO SHENG et. al. | arxiv-cs.CV | 2025-08-08 |
| 261 | Harnessing Adaptive Topology Representations for Zero-Shot Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Built on these, we develop the DynamicTRF framework, which aims to improve both the accuracy and conciseness of graph QA. |
YANBIN WEI et. al. | arxiv-cs.CL | 2025-08-08 |
| 262 | QA-Dragon: Query-Aware Dynamic RAG System for Knowledge-Intensive Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this limitation, we propose QA-Dragon, a Query-Aware Dynamic RAG System for Knowledge-Intensive VQA. |
Zhuohang Jiang; Pangjing Wu; Xu Yuan; Wenqi Fan; Qing Li; | arxiv-cs.AI | 2025-08-07 |
| 263 | Conformal P-Value in Multiple-Choice Question Answering Tasks with Provable Risk Control Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a significance testing-enhanced conformal prediction (CP) framework to improve the trustworthiness of large language models (LLMs) in multiple-choice question answering (MCQA). |
Yuanchang Ye; | arxiv-cs.CL | 2025-08-07 |
| 264 | MKG-RAG: Multimodal Knowledge Graph-Enhanced RAG for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome these challenges, a promising solution is to integrate multimodal knowledge graphs (KGs) into RAG-based VQA frameworks to enhance the generation by introducing structured multimodal knowledge. Therefore, in this paper, we propose a novel multimodal knowledge-augmented generation framework (mKG-RAG) based on multimodal KGs for knowledge-intensive VQA tasks. Specifically, our approach leverages MLLM-powered keyword extraction and vision-text matching to distill semantically consistent and modality-aligned entities/relationships from multimodal documents, constructing high-quality multimodal KGs as structured knowledge representations. |
Xu Yuan; Liangbo Ning; Wenqi Fan; Qing Li; | arxiv-cs.CV | 2025-08-07 |
| 265 | Hop, Skip, and Overthink: Diagnosing Why Reasoning Models Fumble During Multi-Hop Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel, nuanced error categorization framework that examines failures across three critical dimensions: the diversity and uniqueness of source documents involved (hops), completeness in capturing relevant information (coverage), and cognitive inefficiency (overthinking). |
ANUSHKA YADAV et. al. | arxiv-cs.CL | 2025-08-06 |
| 266 | Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While recent approaches have explored text-based chain-of-thought (CoT) reasoning for MLLMs, these methods often suffer from limited cross-modal interaction and increased hallucination, especially with longer videos or reasoning chains. To address these challenges, we propose Video Intelligence via Tool-Augmented Learning (VITAL), a novel end-to-end agentic video reasoning framework. |
HAOJI ZHANG et. al. | arxiv-cs.CV | 2025-08-06 |
| 267 | An Entity Linking Agent for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an entity linking agent for QA, based on a Large Language Model that simulates human cognitive workflows. |
YAJIE LUO et. al. | arxiv-cs.CL | 2025-08-05 |
| 268 | CF-RAG: A Dataset and Method for Carbon Footprint QA Using Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we tackle the challenge of answering questions related to carbon footprints within sustainability reports available in PDF format. |
Kaiwen Zhao; Bharathan Balaji; Stephen Lee; | arxiv-cs.CL | 2025-08-05 |
| 269 | OpenLifelogQA: An Open-Ended Multi-Modal Lifelog Question-Answering Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel lifelog QA dataset called OpenLifelogQA, building upon an 18-month lifelog dataset. |
Quang-Linh Tran; Binh Nguyen; Gareth J. F. Jones; Cathal Gurrin; | arxiv-cs.MM | 2025-08-05 |
| 270 | Domain-Specific Fine-Tuning and Prompt-Based Learning: A Comparative Study for Developing Natural Language-Based BIM Information Retrieval Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents a comparative analysis of two prominent approaches for developing NLI-based BIM information retrieval systems: domain-specific fine-tuning and prompt-based learning using large language models (LLMs). |
Han Gao; Timo Hartmann; Botao Zhong; Kai Lia; Hanbin Luo; | arxiv-cs.IR | 2025-08-05 |
| 271 | Evaluating Variance in Visual Question Answering Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite their advancements, the evaluation of MLLMs on VQA benchmarks often relies on point estimates, overlooking the significant variance in performance caused by factors such as stochastic model outputs, training seed sensitivity, and hyperparameter configurations. This paper critically examines these issues by analyzing variance across 14 widely used VQA benchmarks, covering diverse tasks such as visual reasoning, text understanding, and commonsense reasoning. |
Nikitha SR; | arxiv-cs.CV | 2025-08-04 |
| 272 | A Multi-Agent System for Complex Reasoning in Radiology Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a multi-agent system (MAS) designed to support complex reasoning in RVQA, with specialized agents for context understanding, multimodal reasoning, and answer validation. |
Ziruo Yi; Jinyu Liu; Ting Xiao; Mark V. Albert; | arxiv-cs.AI | 2025-08-04 |
| 273 | SustainableQA: A Comprehensive Question Answering Dataset for Corporate Sustainability and EU Taxonomy Reporting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The growing demand for corporate sustainability transparency, particularly under new regulations like the EU Taxonomy, necessitates precise data extraction from large, unstructured corporate reports, a task for which Large Language Models and RAG systems require high-quality, domain-specific question-answering datasets. To address this, we introduce SustainableQA, a novel dataset and a scalable pipeline that generates comprehensive QA pairs from corporate sustainability and annual reports by integrating semantic chunk classification, a hybrid span extraction pipeline, and a specialized table-to-paragraph transformation. |
Mohammed Ali; Abdelrahman Abdallah; Adam Jatowt; | arxiv-cs.IR | 2025-08-04 |
| 274 | Enhancing Long Video Question Answering with Scene-Localized Frame Grouping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing frameworks and evaluation tasks focus on identifying specific frames containing core objects from a large number of irrelevant frames, which does not align with the practical needs of real-world applications. To address this issue, we propose a new scenario under the video question-answering task, SceneQA, which emphasizes scene-based detail perception and reasoning abilities. |
XUYI YANG et. al. | arxiv-cs.CV | 2025-08-04 |
| 275 | Contextually Aware E-Commerce Product Question Answering Using RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing Product Question Answering (PQA) systems often fail to utilize rich user context and diverse product information effectively. We propose a scalable, end-to-end framework for e-commerce PQA using Retrieval Augmented Generation (RAG) that deeply integrates contextual understanding. |
Praveen Tangarajan; Anand A. Rajasekar; Manish Rathi; Vinay Rao Dandin; Ozan Ersoy; | arxiv-cs.CL | 2025-08-03 |
| 276 | Harnessing Collective Intelligence of LLMs for Robust Biomedical QA: A Multi-Model Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present our participation in the 13th edition of the BioASQ challenge, which involves biomedical semantic question-answering for Task 13b and biomedical question-answering for developing topics for the Synergy task. |
Dimitra Panou; Alexandros C. Dimopoulos; Manolis Koubarakis; Martin Reczko; | arxiv-cs.CL | 2025-08-02 |
| 277 | D-SCoRE: Document-Centric Segmentation and CoT Reasoning with Structured Export for QA-CoT Data Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The scarcity and high cost of high-quality question-answering (QA) datasets hinder supervised fine-tuning (SFT) for domain-specific large language models (LLMs). To address this, we introduce D-SCoRE, a training-free pipeline that utilizes LLMs and prompt engineering to produce diverse, high-quality QA datasets from arbitrary textual sources. |
Weibo Zhou; Lingbo Li; Shangsong Liang; | arxiv-cs.CL | 2025-08-02 |
| 278 | From Query to Logic: Ontology-Driven Multi-Hop Reasoning in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs), despite their success in question answering, exhibit limitations in complex multi-hop question answering (MQA) tasks that necessitate non-linear, structured reasoning. This limitation stems from their inability to adequately capture deep conceptual relationships between entities. To overcome this challenge, we present ORACLE (Ontology-driven Reasoning And Chain for Logical Elucidation), a training-free framework that combines LLMs’ generative capabilities with the structural benefits of knowledge graphs. |
HAONAN BIAN et. al. | arxiv-cs.CL | 2025-08-02 |
| 279 | Prompting Large Language Models with Partial Knowledge for Answering Questions with Unseen Entities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Contrary to the conventional view, we propose a new perspective: LLMs can be awakened via partially relevant knowledge already embedded in LLMs. |
ZHICHAO YAN et. al. | arxiv-cs.CL | 2025-08-02 |
| 280 | DAMR: Efficient and Adaptive Context-Aware Knowledge Graph Question Answering with LLM-Guided MCTS Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the former lacks adaptability due to static path extraction and the absence of contextual refinement, while the latter suffers from high computational costs and limited evaluation accuracy because of their dependence on fixed scoring functions and repeated LLM calls. To address these issues, this paper proposes Dynamically Adaptive MCTS-based Reasoning (DAMR), a novel framework that integrates LLM-guided Monte Carlo Tree Search (MCTS) with adaptive path evaluation to enable efficient and context-aware KGQA. |
YINGXU WANG et. al. | arxiv-cs.CL | 2025-08-01 |
| 281 | ITUNLP at SemEval-2025 Task 8: Question-Answering Over Tabular Data: A Zero-Shot Approach Using LLM-Driven Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our system for SemEval-2025 Task 8: DataBench, Question-Answering over Tabular Data. |
Atakan Site; Emre Hakan Erdemir; Gülşen Eryiğit; | arxiv-cs.CL | 2025-08-01 |
| 282 | MHier-RAG: Multi-Modal RAG for Visual-Rich Document Question-Answering Via Hierarchical and Multi-Granularity Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the former were susceptible to hallucinations, while the latter struggled with inter-modal disconnection and cross-page fragmentation. To address these challenges, a novel multi-modal RAG model, named MHier-RAG, was proposed, leveraging both textual and visual information across long-range pages to facilitate accurate question answering for visual-rich documents. |
Ziyu Gong; Chengcheng Mai; Yihua Huang; | arxiv-cs.MM | 2025-08-01 |
| 283 | Demo: TOSense — What Did You Just Agree To? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To avoid expensive manual annotation, we present a novel Question Answering Evaluation Pipeline (QEP) that generates synthetic questions and verifies the correctness of answers using clustered topic matching. |
Xinzhang Chen; Hassan Ali; Arash Shaghaghi; Salil S. Kanhere; Sanjay Jha; | arxiv-cs.CR | 2025-08-01 |
| 284 | Agentic Large Language Models Improve Retrieval-based Radiology Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose radiology Retrieval and Reasoning (RaR), a multi-step retrieval and reasoning framework designed to improve diagnostic accuracy, factual consistency, and clinical reliability of LLMs in radiology question answering. |
SEBASTIAN WIND et. al. | arxiv-cs.CL | 2025-08-01 |
| 285 | Cascaded Information Disclosure for Generalized Evaluation of Problem Solving Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While question-answering (QA) benchmark performance is an automatic and scalable method to compare LLMs, it is an indirect method of evaluating their underlying problem-solving capabilities. Therefore, we propose a holistic and generalizable framework based on cascaded question disclosure that provides a more accurate estimate of the models’ problem-solving capabilities while maintaining scalability and automation. |
Yunxiang Yan; Tomohiro Sawada; Kartik Goyal; | arxiv-cs.CL | 2025-07-31 |
| 286 | A Benchmark Dataset and Evaluation Framework for Vietnamese Large Language Models in Customer Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this gap, we introduce the Customer Support Conversations Dataset (CSConDa), a curated benchmark of over 9,000 QA pairs drawn from real interactions with human advisors at a large Vietnamese software company. |
LONG S. T. NGUYEN et. al. | arxiv-cs.CL | 2025-07-30 |
| 287 | CUS-QA: Local-Knowledge-Oriented Open-Ended Question Answering Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CUS-QA, a benchmark for open-ended regional question answering that encompasses both textual and visual modalities. |
Jindřich Libovický; Jindřich Helcl; Andrei Manea; Gianluca Vico; | arxiv-cs.CL | 2025-07-30 |
| 288 | Exploring The Application of Visual Question Answering (VQA) for Classroom Activity Monitoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the applicability of several state-of-the-art open-source VQA models, including LLaMA2, LLaMA3, QWEN3, and NVILA, in the context of classroom behavior analysis. |
SINH TRONG VU et. al. | arxiv-cs.CV | 2025-07-30 |
| 289 | Solution for Meta KDD Cup’25: A Comprehensive Three-Step Framework for Vision Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the solutions of all tasks in Meta KDD Cup’25 from the BlackPearl team. |
Zijian Zhang; Xiaocheng Zhang; Yang Zhou; Zhimin Lin; Peng Yan; | arxiv-cs.IR | 2025-07-29 |
| 290 | Knowledge Editing for Multi-Hop Question Answering Using Semantic Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a knowledge editor for MQA based on semantic analysis, called CHECK. |
Dominic Simon; Rickard Ewetz; | arxiv-cs.AI | 2025-07-29 |
| 291 | Analyzing The Sensitivity of Vision Language Models in Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we explore the sensitivity of Vision Language Models (VLMs) through the lens of cooperative principles of conversation proposed by Grice. |
Monika Shah; Sudarshan Balaji; Somdeb Sarkhel; Sanorita Dey; Deepak Venugopal; | arxiv-cs.CV | 2025-07-28 |
| 292 | MediQAl: A French Medical Question Answering Dataset for Knowledge and Reasoning Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces MediQAl, a French medical question answering dataset designed to evaluate the capabilities of language models in factual medical recall and reasoning over real-world clinical scenarios. |
Adrien Bazoge; | arxiv-cs.CL | 2025-07-28 |
| 293 | Shapley Uncertainty in Natural Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It primarily relies on setting a threshold to measure the level of semantic equivalence. We propose a more nuanced framework that extends beyond such thresholding by developing a Shapley-based uncertainty metric that captures the continuous nature of semantic relationships. |
Meilin Zhu; Gaojie Jin; Xiaowei Huang; Lijun Zhang; | arxiv-cs.AI | 2025-07-28 |
| 294 | RoD-TAL: A Benchmark for Answering Questions in Romanian Driving License Exams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The intersection of AI and legal systems presents a growing need for tools that support legal education, particularly in under-resourced languages such as Romanian. In this work, we aim to evaluate the capabilities of Large Language Models (LLMs) and Vision-Language Models (VLMs) in understanding and reasoning about Romanian driving law through textual and visual question-answering tasks. To facilitate this, we introduce RoD-TAL, a novel multimodal dataset comprising Romanian driving test questions, text-based and image-based, alongside annotated legal references and human explanations. |
ANDREI VLAD MAN et. al. | arxiv-cs.CL | 2025-07-25 |
| 295 | A Graph-based Approach for Multi-Modal Question Answering from Flowcharts in Telecom Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the end-to-end approach from processing technical documents, classifying image types, building graph representations, and incorporating them with the text embedding pipeline for efficient retrieval. |
SUMIT SOMAN et. al. | arxiv-cs.CL | 2025-07-25 |
| 296 | PDB-Eval: An Evaluation of Large Multimodal Models for Description and Explanation of Personalized Driving Behavior Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a benchmark, PDB-Eval, for a detailed understanding of Personalized Driver Behavior, and aligning Large Multimodal Models (MLLMs) with driving comprehension and reasoning. |
JUNDA WU et. al. | arxiv-cs.CV | 2025-07-24 |
| 297 | TyDi QA-WANA: A Benchmark for Information-Seeking Question Answering in Languages of West Asia and North Africa Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present TyDi QA-WANA, a question-answering dataset consisting of 28K examples divided among 10 language varieties of western Asia and northern Africa. |
Parker Riley; Siamak Shakeri; Waleed Ammar; Jonathan H. Clark; | arxiv-cs.CL | 2025-07-23 |
| 298 | Leveraging Synthetic Data for Question Answering with Multilingual LLMs in The Agricultural Domain Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Publicly available general-purpose Large Language Models (LLMs) typically offer generic agriculture advisories, lacking precision in local and multilingual contexts. |
RISHEMJIT KAUR et. al. | arxiv-cs.CL | 2025-07-22 |
| 299 | GG-BBQ: German Gender Bias Benchmark for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our work, we evaluate gender bias in German Large Language Models (LLMs) using the Bias Benchmark for Question Answering by Parrish et al. (2022) as a reference. |
SHALAKA SATHEESH et. al. | arxiv-cs.CL | 2025-07-22 |
| 300 | NeuSym-RAG: Hybrid Neural Symbolic Retrieval with Multiview Structuring for PDF Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose NeuSym-RAG, a hybrid neural symbolic retrieval framework which combines both paradigms in an interactive process. |
RUISHENG CAO et. al. | acl | 2025-07-21 |
| 301 | Optimizing Question Semantic Space for Dynamic Retrieval-Augmented Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Optimizing Question Semantic Space for Dynamic Retrieval-Augmented Multi-hop Question Answering (Q-DREAM). |
LINHAO YE et. al. | acl | 2025-07-21 |
| 302 | Beyond Surface Simplicity: Revealing Hidden Reasoning Attributes for Precise Commonsense Diagnosis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current benchmarks overlook these hidden reasoning attributes, making it difficult to assess a model’s specific levels of commonsense knowledge and reasoning ability. To address this issue, we introduce ReComSBench, a novel framework that reveals hidden reasoning attributes behind commonsense questions by leveraging the knowledge generated during the reasoning process. |
Huijun Lian; Zekai Sun; Keqi Chen; Yingming Gao; Ya Li; | acl | 2025-07-21 |
| 303 | MRAKL: Multilingual Retrieval-Augmented Knowledge Graph Construction for Low-Resourced Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we reformulate the mKGC task as a Question Answering (QA) task and introduce mRAKL: a Retrieval-Augmented Generation (RAG) based system to perform mKGC. |
Hellina Hailu Nigatu; Min Li; Maartje ter Hoeve; Saloni Potdar; Sarah Chasins; | arxiv-cs.CL | 2025-07-21 |
| 304 | Grounded, or A Good Guesser? A Per-Question Balanced Dataset to Separate Blind from Grounded Models for Embodied Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, previous work has demonstrated that blind language models (which do not incorporate perception, but predict an answer based solely on the question text) are a strong baseline for existing benchmarks, even compared against state-of-the-art vision and language models. To determine whether a model is grounding its answers in its specific environment, rather than relying on a language model’s expectations about the world generally, we propose PQB-EQA, a *per-question balanced* EQA dataset. |
MILES SHELTON et. al. | acl | 2025-07-21 |
| 305 | CSTree-SRI: Introspection-Driven Cognitive Semantic Tree for Multi-Turn Question Answering Over Extra-Long Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the challenges, we propose the CSTree-SRI framework (Cognitive Semantic Tree through Summarization, Retrieval, and Introspection). |
ZHAOWEN WANG et. al. | acl | 2025-07-21 |
| 306 | Can LLMs Evaluate Complex Attribution in QA? Automatic Benchmarking Using Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Attributed Question Answering (AQA) has attracted wide attention, but there are still several limitations in evaluating the attributions, including lacking fine-grained attribution categories, relying on manual annotations, and failing to compare attributions with only subtle differences. To bridge these gaps, we introduce Complex Attributed Question Answering (CAQA), a large-scale benchmark containing comprehensive attribution categories, automatically generated using Knowledge Graphs (KGs), and complex attribution scenarios. |
NAN HU et. al. | acl | 2025-07-21 |
| 307 | BELLE: A Bi-Level Multi-Agent Reasoning Framework for Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we propose a Bi-levEL muLti-agEnt reasoning (BELLE) framework to address multi-hop QA by specifically focusing on the correspondence between question types and methods, where each type of method is regarded as an “operator” by prompting LLMs differently. |
Taolin Zhang; Dongyang Li; Qizhou Chen; Chengyu Wang; Xiaofeng He; | acl | 2025-07-21 |
| 308 | CaLMQA: Exploring Culturally Specific Long-form Question Answering Across 23 Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We define culturally specific questions as those that refer to concepts unique to one or a few cultures, or have different answers depending on the cultural or regional context. We obtain these questions by crawling naturally-occurring questions from community web forums in high-resource languages, and by hiring native speakers to write questions in under-resourced, rarely-studied languages such as Fijian and Kirundi. |
SHANE ARORA et. al. | acl | 2025-07-21 |
| 309 | NGQA: A Nutritional Graph Question Answering Benchmark for Personalized Health-aware Nutritional Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the other hand, while large language models (LLMs), a popular solution for this task, demonstrate strong reasoning abilities, they struggle with the domain-specific complexities of personalized healthy dietary reasoning, and existing benchmarks fail to capture these challenges. To address these gaps, we introduce the Nutritional Graph Question Answering (NGQA) benchmark, the first graph question answering dataset designed for personalized nutritional health reasoning. |
ZHEYUAN ZHANG et. al. | acl | 2025-07-21 |
| 310 | Masking in Multi-hop QA: An Analysis of How Language Models Perform with Context Permutation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore how LMs respond to multi-hop questions by permuting search results (retrieved documents) under various configurations. |
Wenyu Huang; Pavlos Vougiouklis; Mirella Lapata; Jeff Z. Pan; | acl | 2025-07-21 |
| 311 | YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce YESciEval, an open-source framework that combines fine-grained rubric-based assessment with reinforcement learning to mitigate optimism bias in LLM evaluators. |
Jennifer D’Souza; Hamed Babaei Giglou; Quentin Münch; | acl | 2025-07-21 |
| 312 | Ontology-Guided Reverse Thinking Makes Large Language Models Stronger on Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a result, it is difficult to establish reasoning paths to the purpose, which leads to information loss and redundancy. To address this issue, inspired by human reverse thinking, we propose Ontology-Guided Reverse Thinking (ORT), a novel framework that constructs reasoning paths from purposes back to conditions. |
RUNXUAN LIU et. al. | acl | 2025-07-21 |
| 313 | Reasoning Models Are Test Exploiters: Rethinking Multiple-Choice Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For each model–benchmark pair, we considered 5 ways of presenting the model with questions, including variations on whether multiple choices were offered to the model at all; whether “none of the above” sometimes replaced the right answer; and whether the model was permitted to perform chain-of-thought reasoning before and/or after the choices were presented. |
Narun Raman; Taylor Lundy; Kevin Leyton-Brown; | arxiv-cs.CL | 2025-07-21 |
| 314 | Fast or Slow? Integrating Fast Intuition and Deliberate Thinking for Enhancing Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by Dual Process Theory, which distinguishes between instinctive and deliberate cognitive modes in human reasoning, we propose FOCUS, a plug-and-play approach that dynamically adapts to the complexity of questions, combining fast intuitive judgments with deliberate analytical reasoning to enhance the vision-language reasoning capability of the MLLM. |
Songtao Jiang; Chenyi Zhou; Yan Zhang; Yeying Jin; Zuozhu Liu; | acl | 2025-07-21 |
| 315 | Learning Sparsity for Effective and Efficient Music Performance Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing Music AVQA methods often rely on dense and unoptimized representations, leading to inefficiencies in the isolation of key information, the reduction of redundancy, and the prioritization of critical samples. To address these challenges, we introduce Sparsify, a sparse learning framework specifically designed for Music AVQA. |
XINGJIAN DIAO et. al. | acl | 2025-07-21 |
| 316 | DTCRS: Dynamic Tree Construction for Recursive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce DTCRS, a method that dynamically generates summary trees based on document structure and query semantics. |
Guanran Luo; Zhongquan Jian; Wentao Qiu; Meihong Wang; Qingqiang Wu; | acl | 2025-07-21 |
| 317 | Beyond The Answer: Advancing Multi-Hop QA with Fine-Grained Graph Reasoning and Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We argue that the reasoning process should also be evaluated because a wrong reasoning process can also lead to the correct final answers. Motivated by this, we propose a “Planner-Executor-Reasoner” (PER) architecture, which forms the core of the Plan-anchored Data Preprocessing (PER-DP) and the Plan-guided Multi-Hop QA (PER-QA). |
QICHUAN LIU et. al. | acl | 2025-07-21 |
| 318 | Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing methods and datasets remain focused on a narrow spectrum of tasks, such as forecasting or anomaly detection. To bridge this gap, we introduce Time Series Multi-Task Question Answering (Time-MQA), a unified framework that enables natural language queries across multiple time series tasks – numerical analytical tasks and open-ended question answering with reasoning. |
YAXUAN KONG et. al. | acl | 2025-07-21 |
| 319 | ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Dense retrievers typically outperform sparse methods like BM25 by leveraging semantic embeddings in many tasks; however, they require labeled query-document pairs for fine-tuning, which poses a significant challenge in MHQA due to the complexity of the reasoning steps. To overcome this limitation, we introduce Retriever Supervision with Consistency and Relevance (ReSCORE), a novel method for training dense retrievers for MHQA without the need for labeled documents. |
DOSUNG LEE et. al. | acl | 2025-07-21 |
| 320 | AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce AfriMed-QA, the first large-scale Pan-African English multi-specialty medical Question-Answering (QA) dataset, with 15,000 questions (open and closed-ended) sourced from over 60 medical schools across 16 countries, covering 32 medical specialties. |
CHARLES NIMO et. al. | acl | 2025-07-21 |
| 321 | ComRAG: Retrieval-Augmented Generation with Dynamic Vector Stores for Real-time Community Question Answering in Industry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose ComRAG, a retrieval-augmented generation framework for real-time industrial CQA that integrates static knowledge with dynamic historical QA pairs via a centroid-based memory mechanism designed for retrieval, generation, and efficient storage. |
QINWEN CHEN et. al. | acl | 2025-07-21 |
| 322 | Micro-Act: Mitigate Knowledge Conflict in Question Answering Via Actionable Self-Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches often attempt to mitigate conflicts by directly comparing two knowledge sources in a side-by-side manner, but this can overwhelm LLMs with extraneous or lengthy contexts, ultimately hindering their ability to identify and mitigate inconsistencies. To address this issue, we propose **Micro-Act**, a framework with a hierarchical action space that automatically perceives context complexity and adaptively decomposes each knowledge source into a sequence of fine-grained comparisons. |
NAN HUO et. al. | acl | 2025-07-21 |
| 323 | Not All Terms Matter: Recall-Oriented Adaptive Learning for PLM-aided Query Expansion in Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we propose a novel Recall-oriented Adaptive Learning (ReAL) method, which iteratively adjusts the importance weights of QE terms based on their relevance, thereby refining term distinction and enhancing the separation of relevant terms. |
Xinran Chen; Ben He; Xuanang Chen; Le Sun; | acl | 2025-07-21 |
| 324 | QAEncoder: Towards Aligned Representation Learning in Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the inherent gap between user queries and relevant documents hinders precise matching. We introduce QAEncoder, a training-free approach to bridge this gap. |
ZHENGREN WANG et. al. | acl | 2025-07-21 |
| 325 | Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study Over Open-ended Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to (1) explore whether KGs can make LLMs more trustworthy in an open-ended setting, and (2) conduct a comparative analysis to shed light on method design. |
Yuan Sui; Yufei He; Zifeng Ding; Bryan Hooi; | acl | 2025-07-21 |
| 326 | Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, RAG is not always needed, as it may introduce irrelevant information. Recent adaptive retrieval methods integrate LLMs’ intrinsic knowledge with external information appealing to LLM self-knowledge, but they often neglect efficiency evaluations and comparisons with uncertainty estimation techniques. |
VIKTOR MOSKVORETSKII et. al. | acl | 2025-07-21 |
| 327 | QAEval: Mixture of Evaluators for Question-Answering Task Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: LLM-based evaluation methods offer greater flexibility but suffer from sensitivity to instructions, robustness issues, and high computational costs. To overcome these challenges, we introduce QAEval, a hybrid framework combining rule-based reliability with LLM-based adaptability. |
TAN YUE et. al. | acl | 2025-07-21 |
| 328 | Doc-React: Multi-page Heterogeneous Document Question-answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by iterative frameworks like ReAct, which refine retrieval through feedback, we propose Doc-React, an adaptive iterative framework that balances information gain and uncertainty reduction at each step. |
JUNDA WU et. al. | acl | 2025-07-21 |
| 329 | Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we identify a critical problem, “lost-in-retrieval”, in retrieval-augmented multi-hop question answering (QA): the key entities are missed in LLMs’ sub-question decomposition. |
Rongzhi Zhu; Xiangyu Liu; Zequn Sun; Yiwei Wang; Wei Hu; | acl | 2025-07-21 |
| 330 | On Synthesizing Data for Context Attribution in Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Providing this information is the task of context attribution. In this paper, we systematically study LLM-based approaches for this task, namely we investigate (i) zero-shot inference, (ii) LLM ensembling, and (iii) fine-tuning of small LMs on synthetic data generated by larger LLMs. |
GORJAN RADEVSKI et. al. | acl | 2025-07-21 |
| 331 | Can Large Language Models Accurately Generate Answer Keys for Health-related Questions? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore several approaches to nugget generation for medical question answering and evaluate their alignment with expert human nugget generation. |
Davis Bartels; Deepak Gupta; Dina Demner-Fushman; | acl | 2025-07-21 |
| 332 | How to Compare Things Properly? A Study of Argument Relevance in Comparative Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It poses unique challenges due to the inherently subjective nature of many questions and the need to integrate diverse perspectives. |
IRINA NIKISHINA et. al. | acl | 2025-07-21 |
| 333 | REVISE: A Framework for Revising OCRed Text in Practical Information Systems with Data Contamination Strategy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing approaches primarily focus on solving specific tasks, lacking the capability to structurally organize and systematically manage document information. To address this limitation, we propose Revise, a framework that systematically corrects errors introduced by OCR at the character, word, and structural levels. |
Gyuho Shim; Seongtae Hong; Heuiseok Lim; | acl | 2025-07-21 |
| 334 | A Multi-Agent Framework for Mitigating Dialect Biases in Privacy Policy Question-Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel multi-agent framework inspired by human-centered design principles to mitigate dialectal biases. |
ĐORĐE KLISURA et. al. | acl | 2025-07-21 |
| 335 | EXPLAIN: Enhancing Retrieval-Augmented Generation with Entity Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce EXPLAIN (EXtracting, Pre-summarizing, Linking and enhAncINg RAG), a novel retrieval-augmented generation method that automatically extracts useful entities and generates summaries from documents. |
YAOZHEN LIANG et. al. | acl | 2025-07-21 |
| 336 | RePanda: Pandas-powered Tabular Verification and Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce RePanda, a structured fact verification approach that translates claims into executable pandas queries, enabling interpretable and verifiable reasoning. |
Atoosa Chegini; Keivan Rezaei; Hamid Eghbalzadeh; Soheil Feizi; | acl | 2025-07-21 |
| 337 | Exploiting The Shadows: Unveiling Privacy Leaks Through Lower-Ranked Tokens in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel attack method by exploiting the model’s lower-ranked output tokens to leak sensitive information. |
Yuan Zhou; Zhuo Zhang; Xiangyu Zhang; | acl | 2025-07-21 |
| 338 | InterAct-Video: Reasoning-Rich Video QA for Urban Traffic Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing VideoQA models struggle with the complexity of real-world traffic scenes, where multiple concurrent events unfold across spatiotemporal dimensions. To address these challenges, this paper introduces InterActVideoQA, a curated dataset designed to benchmark and enhance VideoQA models for traffic monitoring tasks. |
JOSEPH RAJ VISHAL et. al. | arxiv-cs.CV | 2025-07-19 |
| 339 | LeAdQA: LLM-Driven Context-Aware Temporal Grounding for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While recent advances in multimodal learning have improved alignment and fusion, current approaches remain limited by two prevalent but fundamentally flawed strategies: (1) task-agnostic sampling indiscriminately processes all frames, overwhelming key events with irrelevant content; and (2) heuristic retrieval captures superficial patterns but misses causal-temporal structures needed for complex reasoning. To address these challenges, we introduce LeAdQA, an innovative approach that bridges these gaps through synergizing causal-aware query refinement with fine-grained visual grounding. |
XINXIN DONG et. al. | arxiv-cs.CV | 2025-07-19 |
| 340 | Team of One: Cracking Complex Video QA with Model Synergy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel framework for open-ended video question answering that enhances reasoning depth and robustness in complex real-world scenarios, as benchmarked on the CVRR-ES dataset. |
JUN XIE et. al. | arxiv-cs.CV | 2025-07-18 |
| 341 | SPARQL Query Generation with LLMs: Measuring The Impact of Training Data Memorization and Knowledge Injection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel method that evaluates the quality of LLMs by generating a SPARQL query from a natural-language question under various conditions: (1) zero-shot SPARQL generation, (2) with knowledge injection, and (3) with anonymized knowledge injection. |
Aleksandr Gashkov; Aleksandr Perevalov; Maria Eltsova; Andreas Both; | arxiv-cs.IR | 2025-07-18 |
| 342 | Teaching Vision-Language Models to Ask: Resolving Ambiguity in Visual Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome these challenges, we introduce the ClearVQA benchmark, which targets three common categories of ambiguity in VQA context, and encompasses various VQA scenarios. |
Pu Jian; Donglei Yu; Wen Yang; Shuo Ren; Jiajun Zhang; | arxiv-cs.CV | 2025-07-18 |
| 343 | BifrostRAG: Bridging Dual Knowledge Graphs for Multi-Hop Question Answering in Construction Safety Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome this, we introduce BifrostRAG: a dual-graph RAG-integrated system that explicitly models both linguistic relationships (via an Entity Network Graph) and document structure (via a Document Navigator Graph). This architecture powers a hybrid retrieval mechanism that combines graph traversal with vector-based semantic search, enabling large language models to reason over both the meaning and the structure of the text. |
Yuxin Zhang; Xi Wang; Mo Hu; Zhenyu Zhang; | arxiv-cs.AI | 2025-07-17 |
| 344 | FIQ: Fundamental Question Generation with The Integration of Question Embeddings for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a fundamental question generation with the integration of question embeddings for video question answering (FIQ), a novel approach designed to strengthen the reasoning ability of the model by enhancing the fundamental understanding of videos. |
Ju-Young Oh; Ho-Joong Kim; Seong-Whan Lee; | arxiv-cs.CV | 2025-07-17 |
| 345 | COREVQA: A Crowd Observation and Reasoning Entailment Visual Question Answering Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this, we propose COREVQA (Crowd Observations and Reasoning Entailment), a benchmark of 5608 image and synthetically generated true/false statement pairs, with images derived from the CrowdHuman dataset, to provoke visual entailment reasoning on challenging crowded images. |
ISHANT CHINTAPATLA et. al. | arxiv-cs.CV | 2025-07-17 |
| 346 | POLYCHARTQA: Benchmarking Large Vision-Language Models with Multilingual Chart Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present PolyChartQA, the first large-scale multilingual chart question answering benchmark covering 22,606 charts and 26,151 question-answering pairs across 10 diverse languages. |
Yichen Xu; Liangyu Chen; Liang Zhang; Wenxuan Wang; Qin Jin; | arxiv-cs.CL | 2025-07-16 |
| 347 | Describe Anything Model for Visual Question Answering on Text-rich Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In such settings, the fine-grained extraction of textual information is crucial to producing correct answers. Motivated by this, we introduce DAM-QA, a framework with a tailored evaluation protocol, developed to investigate and harness the region-aware capabilities from DAM for the text-rich VQA problem that requires reasoning over text-based information within images. |
YEN-LINH VU et. al. | arxiv-cs.CV | 2025-07-16 |
| 348 | The Benefits of Query-based KGQA Systems for Complex and Temporal Questions in LLM Era Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore a multi-stage query-based framework for WikiData QA, proposing a multi-stage approach that enhances performance on challenging multi-hop and temporal benchmarks. |
ARTEM ALEKSEEV et. al. | arxiv-cs.CL | 2025-07-16 |
| 349 | 3D-MoRe: Unified Modal-Contextual Reasoning for Embodied Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the growing need for diverse and scalable data in indoor scene tasks, such as question answering and dense captioning, we propose 3D-MoRe, a novel paradigm designed to generate large-scale 3D-language datasets by leveraging the strengths of foundational models. |
RONGTAO XU et. al. | arxiv-cs.CV | 2025-07-16 |
| 350 | EsBBQ and CaBBQ: The Spanish and Catalan Bias Benchmarks for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the notable lack of resources for social bias evaluation in languages other than English, and for social contexts outside of the United States, this paper introduces the Spanish and the Catalan Bias Benchmarks for Question Answering (EsBBQ and CaBBQ). |
VALLE RUIZ-FERNÁNDEZ et. al. | arxiv-cs.CL | 2025-07-15 |
| 351 | MapIQ: Evaluating Multimodal Large Language Models for Map Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate multiple MLLMs using six visual analytical tasks, comparing their performance against one another and a human baseline. |
Varun Srivastava; Fan Lei; Srija Mukhopadhyay; Vivek Gupta; Ross Maciejewski; | arxiv-cs.CL | 2025-07-15 |
| 352 | ExpliCIT-QA: Explainable Code-Based Image Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present ExpliCIT-QA, a system that extends our previous MRT approach for tabular question answering into a multimodal pipeline capable of handling complex table images and providing explainable answers. |
Maximiliano Hormazábal Lagos; Álvaro Bueno Sáez; Pedro Alonso Doval; Jorge Alcalde Vesteiro; Héctor Cerezo-Costas; | arxiv-cs.CL | 2025-07-15 |
| 353 | Warehouse Spatial Question Answering with LLM Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a data-efficient approach. |
HSIANG-WEI HUANG et. al. | arxiv-cs.CV | 2025-07-14 |
| 354 | CoralVQA: A Large-Scale Visual Question Answering Dataset for Coral Reef Image Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce CoralVQA, the first large-scale VQA dataset for coral reef analysis. |
Hongyong Han; Wei Wang; Gaowei Zhang; Mingjie Li; Yi Wang; | arxiv-cs.CV | 2025-07-14 |
| 355 | Wrong Answers Can Also Be Useful: PlausibleQA – A Large-Scale QA Dataset with Answer Plausibility Scores Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing QA datasets primarily focus on correct answers without explicit consideration of the plausibility of other candidate answers, limiting opportunity for more nuanced evaluations of models. To address this gap, we introduce PlausibleQA, a large-scale dataset comprising 10,000 questions and 100,000 candidate answers, each annotated with plausibility scores and justifications for their selection. |
Jamshid Mozafari; Abdelrahman Abdallah; Bhawna Piryani; Adam Jatowt; | sigir | 2025-07-13 |
| 356 | WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present WebFAQ, a large-scale collection of open-domain question answering datasets derived from FAQ-style schema.org annotations. |
Michael Dinzinger; Laura Caspari; Kanishka Ghosh Dastidar; Jelena Mitrović; Michael Granitzer; | sigir | 2025-07-13 |
| 357 | Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for Deep Research Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Researchy Questions, the world’s first, only and largest public dataset of “Deep Research” questions filtered from real search engine logs to be non-factoid, “decompositional” and multi-perspective. |
CORBIN ROSSET et. al. | sigir | 2025-07-13 |
| 358 | Question-Answering Dense Video Events Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For improvement, we propose DeVi, a novel training-free MLLM approach that highlights a hierarchical captioning module, a temporal event memory module, and a self-consistency checking module to respectively detect, contextualize and memorize, and ground dense-events in long videos for question answering. |
Hangyu Qin; Junbin Xiao; Angela Yao; | sigir | 2025-07-13 |
| 359 | ClusterChat: Multi-Feature Search for Corpus Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present ClusterChat, an open-source system for corpus exploration that integrates cluster-based organization of documents using textual embeddings with lexical and semantic search, timeline-driven exploration, and corpus and document-level question answering (QA) as multi-feature search capabilities. The demo video and source code are available at: https://github.com/achouhan93/ClusterChat. |
Ashish Chouhan; Saifeldin Mandour; Michael Gertz; | sigir | 2025-07-13 |
| 360 | PILs of Knowledge: A Synthetic Benchmark for Evaluating Question Answering Systems in Healthcare Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, no dedicated benchmark currently exists to evaluate QA systems specifically on PILs, limiting progress in this domain. To address this gap, we introduce a fact-supported synthetic benchmark composed of multiple-choice questions and answers generated from real PILs. |
RICCARDO LUNARDI et. al. | sigir | 2025-07-13 |
| 361 | Dynamic-KGQA: A Scalable Framework for Generating Adaptive Question Answering Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce Dynamic-KGQA, a scalable framework for generating adaptive QA datasets from knowledge graphs (KGs), designed to mitigate memorization risks while maintaining statistical consistency across iterations. |
Preetam Prabhu Srikar Dammu; Himanshu Naidu; Chirag Shah; | sigir | 2025-07-13 |
| 362 | Graph-Based Multimodal Contrastive Learning for Chart Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces a novel joint multimodal scene graph framework that explicitly models the relationships among chart components and their underlying structures. |
Yue Dai; Soyeon Caren Han; Wei Liu; | sigir | 2025-07-13 |
| 363 | NLQxform-UI: An Interactive and Intuitive Scholarly Question Answering System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop an interactive and intuitive scholarly question answering system called NLQxform-UI, which allows users to pose complex queries in the form of natural language questions. |
Ruijie Wang; Zhiruo Zhang; Luca Rossetto; Florian Ruosch; Abraham Bernstein; | sigir | 2025-07-13 |
| 364 | An Empirical Study of Evaluating Long-form Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We collect 5,236 factoid and non-factoid long-form answers generated by different large language models and conduct a human evaluation on 2,079 of them, focusing on correctness and informativeness. |
Ning Xian; Yixing Fan; Ruqing Zhang; Maarten de Rijke; Jiafeng Guo; | sigir | 2025-07-13 |
| 365 | CG-RAG: Research Question Answering By Citation Graph Retrieval-Augmented LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Contextualized Graph Retrieval-Augmented Generation (CG-RAG), a novel framework that integrates sparse and dense retrieval signals within graph structures to enhance retrieval efficiency and subsequently improve generation quality for research question answering. |
YUNTONG HU et. al. | sigir | 2025-07-13 |
| 366 | Evaluating LLMs’ (In)ability to Follow Prompts in QA Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address our research question, we propose Oedipus, a framework for evaluating LLMs’ ability to follow prompts. |
Aparup Khatua; Tobias Kalmbach; Prasenjit Mitra; Sandipan Sikdar; | sigir | 2025-07-13 |
| 367 | Understanding Large Language Model Performance in Software Engineering: A Large-scale Question Answering Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce CodeRepoQA, a large-scale benchmark specifically designed for evaluating repository-level question-answering capabilities in the field of software engineering. |
RUIDA HU et. al. | sigir | 2025-07-13 |
| 368 | Towards Spatial Audio Understanding Via Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel framework for spatial audio understanding of first-order ambisonic (FOA) signals through a question answering (QA) paradigm, aiming to extend the scope of sound event localization and detection (SELD) towards spatial scene understanding and reasoning. |
Parthasaarathy Sudarsanam; Archontis Politis; | arxiv-cs.SD | 2025-07-12 |
| 369 | What Factors Affect LLMs and RLLMs in Financial Question Answering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To investigate the impact of various methods on LLMs and RLLMs, we utilize five LLMs and three RLLMs to assess the effects of prompting methods, agentic frameworks, and multilingual alignment methods on financial question-answering tasks. |
PENG WANG et. al. | arxiv-cs.CL | 2025-07-11 |
| 370 | Exploring The Limits of Model Compression in LLMs: A Knowledge Distillation Study on QA Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate student models distilled from the Pythia and Qwen2.5 families on two QA benchmarks, SQuAD and MLQA, under zero-shot and one-shot prompting conditions. |
Joyeeta Datta; Niclas Doll; Qusai Ramadan; Zeyd Boukhers; | arxiv-cs.CL | 2025-07-10 |
| 371 | FrugalRAG: Learning to Retrieve and Reason for Multi-hop QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that: (1) Large-scale fine-tuning is not needed to improve RAG metrics, contrary to popular claims in recent literature. |
Abhinav Java; Srivathsan Koundinyan; Nagarajan Natarajan; Amit Sharma; | arxiv-cs.CL | 2025-07-10 |
| 372 | Data-Balanced Curriculum Learning for Audio Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current models struggle with dataset imbalances and unstable training dynamics. This work combines curriculum learning with statistical data balancing to address these challenges. |
Gijs Wijngaard; Elia Formisano; Michele Esposito; Michel Dumontier; | arxiv-cs.SD | 2025-07-09 |
| 373 | Barriers in Integrating Medical Visual Question Answering Into Radiology Workflows: A Scoping Review and Clinicians’ Insights Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study systematically reviews 68 publications (2018-2024) and surveys 50 clinicians from India and Thailand to examine MedVQA’s practical utility, challenges, and gaps. |
Deepali Mishra; Chaklam Silpasuwanchai; Ashutosh Modi; Madhumita Sushil; Sorayouth Chumnanvej; | arxiv-cs.CL | 2025-07-09 |
| 374 | Enhancing Food-Domain Question Answering with A Multimodal Knowledge Graph: Hybrid QA Generation and Diversity Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a unified food-domain QA framework that combines a large-scale multimodal knowledge graph (MMKG) with generative AI. |
Srihari K B; Pushpak Bhattacharyya; | arxiv-cs.CL | 2025-07-09 |
| 375 | Enhancing Scientific Visual Question Answering Through Multimodal Reasoning and Ensemble Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conducted a series of experiments using models with 5B to 8B parameters. Our strongest individual model, InternVL3, achieved ROUGE-1 and ROUGE-L F1 scores of 0.740 and a BERTScore of 0.983 on the SciVQA test split. |
Prahitha Movva; Naga Harshita Marupaka; | arxiv-cs.CV | 2025-07-08 |
| 376 | LLM-based Question-Answer Framework for Sensor-driven HVAC System Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present JARVIS, a two-stage LLM-based QA framework tailored for sensor data-driven HVAC system interaction. |
SUNGMIN LEE et. al. | arxiv-cs.AI | 2025-07-07 |
| 377 | Building Open-Retrieval Conversational Question Answering Systems By Generating Synthetic Data and Decontextualizing User Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a pipeline that capitalizes on the abundance of plain text documents in organizations (e.g., product documentation) to automatically produce realistic OR-CONVQA dialogs with annotations. |
CHRISTOS VLACHOS et. al. | arxiv-cs.CL | 2025-07-07 |
| 378 | Assessing The Capabilities and Limitations of FinGPT Model in Financial NLP Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work evaluates FinGPT, a financial domain-specific language model, across six key natural language processing (NLP) tasks: Sentiment Analysis, Text Classification, Named Entity Recognition, Financial Question Answering, Text Summarization, and Stock Movement Prediction. |
Prudence Djagba; Chimezie A. Odinakachukwu; | arxiv-cs.CL | 2025-07-06 |
| 379 | Beyond Independent Passages: Adaptive Passage Combination Retrieval for Retrieval Augmented Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating external documents at inference time, enabling up-to-date knowledge access without costly retraining. |
Ting-Wen Ko; Jyun-Yu Jiang; Pu-Jen Cheng; | arxiv-cs.CL | 2025-07-05 |
| 380 | Coling-UniA at SciVQA 2025: Few-Shot Example Retrieval and Confidence-Informed Ensembling for Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes our system for the SciVQA 2025 Shared Task on Scientific Visual Question Answering. |
Christian Jaumann; Annemarie Friedrich; Rainer Lienhart; | arxiv-cs.CL | 2025-07-03 |
| 381 | Chart Question Answering from Real-World Analytical Narratives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new dataset for chart question answering (CQA) constructed from visualization notebooks. |
Maeve Hutchinson; Radu Jianu; Aidan Slingsby; Jo Wood; Pranava Madhyastha; | arxiv-cs.CL | 2025-07-02 |
| 382 | OpenTable-R1: A Reinforcement Learning Augmented Tool Agent for Open-Domain Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In contrast, we propose an end-to-end agentic framework that embeds multi-turn tool calls (using a BM25+-based search API and a SQLite SQL executor) directly into a large language model. |
Zipeng Qiu; | arxiv-cs.CL | 2025-07-02 |
| 383 | Retriever-generator-verification: A Novel Approach to Enhancing Factual Coherence in Open-domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
SHIQI SUN et. al. | Inf. Process. Manag. | 2025-07-01 |
| 384 | Temporal Chain of Thought: Long-Video Understanding By Thinking in Frames Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Temporal Chain of Thought, an inference strategy for video question-answering that curates the model’s input context. |
Anurag Arnab; Ahmet Iscen; Mathilde Caron; Alireza Fathi; Cordelia Schmid; | arxiv-cs.LG | 2025-07-01 |
| 385 | Read The Docs Before Rewriting: Equip Rewriter with Domain Knowledge Via Continual Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in specialized domains, the rewriter model may struggle due to limited domain-specific knowledge. To resolve this, we propose the R&R (Read the doc before Rewriting) rewriter, which involves continual pre-training on professional documents, akin to how students prepare for open-book exams by reviewing textbooks. |
Qi Wang; Yixuan Cao; Yifan Liu; Jiangtao Zhao; Ping Luo; | arxiv-cs.IR | 2025-07-01 |
| 386 | A Knowledge Graph-enhanced Large Language Model for Question Answering of Hydraulic Structure Safety Management Related Papers Related Patents Related Grants Related Venues Related Experts View |
DONGLIANG ZHANG et. al. | Adv. Eng. Informatics | 2025-07-01 |
| 387 | Positional Bias in Binary Question Answering: How Uncertainty Shapes Model Preferences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we quantify and analyze positional bias across five large language models under varying degrees of answer uncertainty. We re-adapted the SQuAD-it dataset by adding an extra incorrect answer option and then created multiple versions with progressively less context and more out-of-context answers, yielding datasets that range from low to high uncertainty. |
Tiziano Labruna; Simone Gallo; Giovanni Da San Martino; | arxiv-cs.CL | 2025-06-30 |
| 388 | MedEthicsQA: A Comprehensive Question Answering Benchmark for Medical Ethics Evaluation of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MedEthicsQA, a comprehensive benchmark comprising 5,623 multiple-choice questions and 5,351 open-ended questions for evaluation of medical ethics in LLMs. We systematically establish a hierarchical taxonomy integrating global medical ethical standards. |
JIANHUI WEI et. al. | arxiv-cs.CL | 2025-06-28 |
| 389 | Towards Probabilistic Question Answering Over Tabular Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a new benchmark LUCARIO and a framework for probabilistic QA over large tabular data. Our method induces Bayesian Networks from tables, translates natural language queries into probabilistic queries, and uses large language models (LLMs) to generate final answers. |
Chen Shen; Sajjadur Rahman; Estevam Hruschka; | arxiv-cs.CL | 2025-06-25 |
| 390 | MultiFinRAG: An Optimized Multimodal Retrieval-Augmented Generation (RAG) Framework for Financial Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce MultiFinRAG, a retrieval-augmented generation framework purpose-built for financial QA. |
Chinmay Gondhalekar; Urjitkumar Patel; Fang-Chun Yeh; | arxiv-cs.CL | 2025-06-25 |
| 391 | FactTest: Factuality Testing in Large Language Models with Finite-Sample and Distribution-Free Guarantees Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce FactTest, a novel framework that statistically assesses whether an LLM can provide correct answers to given questions with high-probability correctness guarantees. |
FAN NIE et. al. | icml | 2025-06-25 |
| 392 | Understanding Complexity in VideoQA Via Visual Program Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a data-driven approach to analyzing query complexity in Video Question Answering (VideoQA). |
CRISTOBAL EYZAGUIRRE et. al. | icml | 2025-06-25 |
| 393 | ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this, we introduce the Time-Series Question Answering (Time-Series QA) task and release EngineMT-QA, the first large-scale, multi-task, temporal-textual QA dataset designed to capture complex interactions between time-series signals and natural language. Building on this resource, we propose the Instruct Time Transformer (ITFormer), a novel framework that bridges time-series encoders with frozen large language models (LLMs). |
YILIN WANG et. al. | icml | 2025-06-25 |
| 394 | Divide and Conquer: Exploring Language-centric Tree Reasoning for Video Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Video Question-Answering (VideoQA) remains challenging in achieving advanced cognitive reasoning due to the uncontrollable and opaque reasoning processes in existing Multimodal Large Language Models (MLLMs). To address this issue, we propose a novel Language-centric Tree Reasoning (LTR) framework that targets on enhancing the reasoning ability of models. |
ZHAOHE LIAO et. al. | icml | 2025-06-25 |
| 395 | TUMTraf VideoQA: Dataset and Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present TUMTraf VideoQA, a novel dataset and benchmark designed for spatio-temporal video understanding in complex roadside traffic scenarios. |
XINGCHENG ZHOU et. al. | icml | 2025-06-25 |
| 396 | 3D Question Answering Via Only 2D Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore how to harness their potential to address 3D scene understanding tasks, using 3D question answering (3D-QA) as a representative example. |
FENGYUN WANG et. al. | icml | 2025-06-25 |
| 397 | DocVXQA: Context-Aware Visual Explanations for Document Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose **DocVXQA**, a novel framework for visually self-explainable document question answering, where the goal is not only to produce accurate answers to questions but also to learn visual heatmaps that highlight critical regions, offering interpretable justifications for the model decision. |
MOHAMED ALI SOUIBGUI et. al. | icml | 2025-06-25 |
| 398 | Knowledge-Aware Diverse Reranking for Cross-Source Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Team Marikarp’s solution for the SIGIR 2025 LiveRAG competition. |
Tong Zhou; | arxiv-cs.CL | 2025-06-25 |
| 399 | Inference Scaled GraphRAG: Improving Multi Hop Question Answering on Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Inference-Scaled GraphRAG, a novel framework that enhances LLM-based graph reasoning by applying inference-time compute scaling. |
Travis Thompson; Seung-Hwan Lim; Paul Liu; Ruoying He; Dongkuan Xu; | arxiv-cs.CL | 2025-06-24 |
| 400 | Cause-Effect Driven Optimization for Robust Medical Visual Question Answering with Language Biases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing Medical Visual Question Answering (Med-VQA) models often suffer from language biases, where spurious correlations between question types and answer categories are inadvertently established. To address these issues, we propose a novel Cause-Effect Driven Optimization framework called CEDO, that incorporates three well-established mechanisms, i.e., Modality-driven Heterogeneous Optimization (MHO), Gradient-guided Modality Synergy (GMS), and Distribution-adapted Loss Rescaling (DLR), for comprehensively mitigating language biases from both causal and effectual perspectives. |
Huanjia Zhu; Yishu Liu; Xiaozhao Fang; Guangming Lu; Bingzhi Chen; | arxiv-cs.CV | 2025-06-22 |
| 401 | PDF Retrieval Augmented Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents an advancement in Question-Answering (QA) systems using a Retrieval Augmented Generation (RAG) framework to enhance information extraction from PDF files. |
Thi Thu Uyen Hoang; Viet Anh Nguyen; | arxiv-cs.CL | 2025-06-22 |
| 402 | A Comprehensive Graph Framework for Question Answering with Mode-Seeking Preference Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent advancements in retrieval-augmented generation (RAG) have enhanced large language models in question answering by integrating external knowledge. However, challenges persist in achieving global understanding and aligning responses with human ethical and quality preferences. To address these issues, we propose GraphMPA, a comprehensive graph-based framework with mode-seeking preference alignment. |
QUANWEI TANG et. al. | arxiv-cs.CL | 2025-06-22 |
| 403 | UNITQA: A Unified Automated Tabular Question Answering System with Multi-Agent Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automated tabular question answering (TQA) has attracted significant attention in data analysis and natural language processing communities due to its powerful capabilities. The … |
JUN-PENG ZHU et. al. | Companion of the 2025 International Conference on … | 2025-06-22 |
| 404 | MUPA: Towards Multi-Path Agentic Reasoning for Grounded Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose MUPA, a cooperative MUlti-Path Agentic approach that unifies video grounding, question answering, answer reflection and aggregation to tackle Grounded VideoQA. |
JISHENG DANG et. al. | arxiv-cs.CV | 2025-06-22 |
| 405 | LastingBench: Defend Benchmarks Against Knowledge Leakage Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce LastingBench, a novel framework designed to continuously reinforce and safeguard existing benchmarks against knowledge leakage. |
Yixiong Fang; Tianran Sun; Yuling Shi; Min Wang; Xiaodong Gu; | arxiv-cs.CL | 2025-06-21 |
| 406 | Resource-Friendly Dynamic Enhancement Chain for Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a novel framework called DEC (Dynamic Enhancement Chain). |
BINQUAN JI et. al. | arxiv-cs.CL | 2025-06-21 |
| 407 | RAGentA: Multi-Agent Retrieval-Augmented Generation for Attributed Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present RAGentA, a multi-agent retrieval-augmented generation (RAG) framework for attributed question answering (QA) with large language models (LLMs). |
Ines Besrour; Jingbo He; Tobias Schreieder; Michael Färber; | arxiv-cs.IR | 2025-06-20 |
| 408 | ESapiens: A Real-World NLP Framework for Multimodal Document Understanding and Enterprise Knowledge Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce eSapiens, a unified question-answering system designed for enterprise settings, which bridges structured databases and unstructured textual corpora via a dual-module architecture. |
ISAAC SHI et. al. | arxiv-cs.IR | 2025-06-20 |
| 409 | How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate whether off-the-shelf Multimodal Large Language Models (MLLMs) can tackle Online Episodic-Memory Video Question Answering (OEM-VQA) without additional training. |
Giuseppe Lando; Rosario Forte; Giovanni Maria Farinella; Antonino Furnari; | arxiv-cs.CV | 2025-06-19 |
| 410 | Enhancing Document-Level Question Answering Via Multi-Hop Retrieval-Augmented Generation with LLaMA 3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel Retrieval-Augmented Generation (RAG) framework tailored for complex question answering tasks, addressing challenges in multi-hop reasoning and contextual understanding across lengthy documents. Built upon LLaMA 3, the framework integrates a dense retrieval module with advanced context fusion and multi-hop reasoning mechanisms, enabling more accurate and coherent response generation. |
XINYUE HUANG et. al. | arxiv-cs.CL | 2025-06-19 |
| 411 | MEGC2025: Micro-Expression Grand Challenge on Spot Then Recognize and Visual Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Facial micro-expressions (MEs) are involuntary movements of the face that occur spontaneously when a person experiences an emotion but attempts to suppress or repress the facial … |
XINQI FAN et. al. | arxiv-cs.CV | 2025-06-18 |
| 412 | Evaluating Multimodal Large Language Models on Educational Textbook Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents the first evaluation of state-of-the-art MLLMs, including LLaVA-1.5 and LLaMA 3.2-Vision, on the textbook question answering (TQA) task using the CK12-QA dataset. |
HESSA A. ALAWWAD et. al. | arxiv-cs.CL | 2025-06-18 |
| 413 | MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most notably, existing methods overlook the distinction between factoid and non-factoid questions. To address these challenges, we propose MinosEval, a novel evaluation method that first distinguishes open-ended questions and then ranks candidate answers using different evaluation strategies. |
YONGQI FAN et. al. | arxiv-cs.CL | 2025-06-18 |
| 414 | A Multilingual Multimodal Medical Examination Dataset for Visual Question Answering in Healthcare Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Vision-Language Models (VLMs) excel in multimodal tasks, yet their effectiveness in specialized medical applications remains underexplored. Accurate interpretation of medical … |
GIUSEPPE RICCIO et. al. | 2025 IEEE 38th International Symposium on Computer-Based … | 2025-06-18 |
| 415 | Adapting Lightweight Vision Language Models for Radiological Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we fine-tune a lightweight 3B parameter vision-language model for Radiological VQA, demonstrating that small models, when appropriately tuned with curated data, can achieve robust performance across both open- and closed-ended questions. |
Aditya Shourya; Michel Dumontier; Chang Sun; | arxiv-cs.CV | 2025-06-17 |
| 416 | Moment Sampling in Video LLMs for Long-Form Video QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the use of a general-purpose text-to-video moment retrieval model to guide the frame sampling process. |
MUSTAFA CHASMAI et. al. | arxiv-cs.CV | 2025-06-17 |
| 417 | ArgHiTZ at ArchEHR-QA 2025: A Two-Step Divide and Conquer Approach to Patient Question Answering for Top Factuality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an end-to-end prompt-based baseline and two two-step methods to divide the task, without utilizing any external knowledge. |
ADRIÁN CUADRÓN et. al. | arxiv-cs.CL | 2025-06-15 |
| 418 | Med-U1: Incentivizing Unified Medical Reasoning in LLMs Via Large-scale Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Med-U1, a unified framework for robust reasoning across medical QA tasks with diverse output formats, ranging from MCQs to complex generation and computation tasks. |
XIAOTIAN ZHANG et. al. | arxiv-cs.CL | 2025-06-13 |
| 419 | Exploring The Potential of Multimodal Large Language Models for Question Answering on Artworks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper investigates the application of a Multimodal Large Language Model to enhance visitor experiences in cultural heritage settings through Visual Question Answering (VQA) … |
Alessio Ferrato; Carla Limongelli; Fabio Gasparetti; Giuseppe Sansonetti; A. Micarelli; | Adjunct Proceedings of the 33rd ACM Conference on User … | 2025-06-12 |
| 420 | Neural at ArchEHR-QA 2025: Agentic Prompt Optimization for Evidence-Grounded Clinical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present Neural, the runner-up in the BioNLP 2025 ArchEHR-QA shared task on evidence-grounded clinical QA. |
SAI PRASANNA TEJA REDDY BOGIREDDY et. al. | arxiv-cs.LG | 2025-06-12 |
| 421 | Mind The Gap: Benchmarking LLM Uncertainty, Discrimination, and Calibration in Specialty-Aware Clinical QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate uncertainty estimation methods for clinical QA, focusing, for the first time, on eleven clinical specialties and six question types, and across ten open-source LLMs (general-purpose, biomedical, and reasoning models). |
Alberto Testoni; Iacer Calixto; | arxiv-cs.CL | 2025-06-12 |
| 422 | Dynamic Double Space Tower Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods often have difficulty handling complex reasoning scenarios due to insufficient cross-modal interaction and capturing the entity spatial relationships in the image. We studied a brand-new approach to replace the attention mechanism in order to enhance the reasoning ability of the model and its understanding of spatial relationships. Specifically, we propose a dynamic bidirectional spatial tower, which is divided into four layers to observe the image according to the principle of human gestalt vision. |
Weikai Sun; Shijie Song; Han Wang; | arxiv-cs.CV | 2025-06-12 |
| 423 | Team Anotheroption at SemEval-2025 Task 8: Bridging The Gap Between Open-Source and Proprietary LLMs in Table QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a system developed for SemEval 2025 Task 8: Question Answering (QA) over tabular data. |
Nikolas Evkarpidi; Elena Tutubalina; | arxiv-cs.CL | 2025-06-11 |
| 424 | Token Constraint Decoding Improves Robustness on Question Answering for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce and evaluate Token Constraint Decoding (TCD). |
JUI-MING YAO et. al. | arxiv-cs.CL | 2025-06-11 |
| 425 | ICT-QA: Question Answering Over Multi-Modal Contexts Including Image, Chart, and Text Modalities Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: For question answering in multi-modal contexts that include image, chart, and text modalities, a model must be proficient in understanding each individual modality. Furthermore, … |
YOUNGROK JANG et. al. | 2025 IEEE/CVF Conference on Computer Vision and Pattern … | 2025-06-11 |
| 426 | CadenceRAG: Context-Aware and Dependency-Enhanced Retrieval Augmented Generation for Holistic Video Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper addresses the challenging problem of holistic video understanding, focusing on rich-text-based video retrieval and question answering. Compared to simple video … |
HENG LIU et. al. | 2025 IEEE/CVF Conference on Computer Vision and Pattern … | 2025-06-11 |
| 427 | An LLM Framework for Long-Form Video Retrieval and Audio-Visual Question Answering Using Qwen2/2.5 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper presents our approach to tackle the tasks of Known Item Search (KIS) and Video Question Answering (Video QA) by combining state-of-the-art LLMs and cross-modal video … |
Damianos Galanopoulos; Andreas Goulas; Antonios Leventakis; Ioannis Patras; Vasileios Mezaris; | 2025 IEEE/CVF Conference on Computer Vision and Pattern … | 2025-06-11 |
| 428 | VRAG: Retrieval-Augmented Video Question Answering for Long-Form Videos Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The rapid expansion of video data across various domains has heightened the demand for efficient retrieval and question-answering systems, particularly for long-form videos. … |
BAO TRAN GIA et. al. | 2025 IEEE/CVF Conference on Computer Vision and Pattern … | 2025-06-11 |
| 429 | Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-k Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Retrieval-augmented generation (RAG) and long-context language models (LCLMs) both address context limitations of LLMs in open-domain question answering (QA). However, optimal … |
Chihiro Taguchi; Seiji Maekawa; Nikita Bhutani; | ArXiv | 2025-06-10 |
| 430 | Improved LLM Agents for Financial Document Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building upon this framework, this paper examines the effectiveness of the traditional critic agent when oracle labels are not available, and show, through experiments, that this critic agent’s performance deteriorates in this scenario. With this in mind, we present an improved critic agent, along with the calculator agent which outperforms the previous state-of-the-art approach (program-of-thought) and is safer. |
NELVIN TAN et. al. | arxiv-cs.CL | 2025-06-10 |
| 431 | Efficient Context Selection for Long-Context QA: No Tuning, No Iteration, Just Adaptive-$k$ Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Retrieval-augmented generation (RAG) and long-context language models (LCLMs) both address context limitations of LLMs in open-domain question answering (QA). However, optimal … |
Chihiro Taguchi; Seiji Maekawa; Nikita Bhutani; | arxiv-cs.CL | 2025-06-10 |
| 432 | CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmarking of Large Language Models in Mental Health Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Medical question answering (QA) benchmarks often focus on multiple-choice or fact-based tasks, leaving open-ended answers to real patient questions underexplored. This gap is … |
YAHAN LI et. al. | arxiv-cs.CL | 2025-06-10 |
| 433 | VidBridge-R1: Bridging QA and Captioning for RL-based Video Understanding Models with Intermediate Proxy Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Naively combining reward signals from these tasks results in mutual performance degradation, which we attribute to a conflict between their opposing task natures. To address this challenge, we propose a novel training framework built upon two intermediate proxy tasks: DarkEventInfer, which presents videos with masked event segments, requiring models to infer the obscured content based on contextual video cues; and MixVidQA, which presents interleaved video sequences composed of two distinct clips, challenging models to isolate and reason about one while disregarding the other. |
XINLONG CHEN et. al. | arxiv-cs.CV | 2025-06-09 |
| 434 | HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a novel image-text reasoning supervised fine-tuning data generation pipeline to create specific domain quadruplets (image, question, thinking trace, and answer) from domain-specific materials. |
Shijie Wang; Yilun Zhang; Zeyu Lai; Dexing Kong; | arxiv-cs.AI | 2025-06-09 |
| 435 | ScIRGen: Synthesize Realistic and Large-Scale RAG Dataset for Scientific Research Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Scientific researchers need intensive information about datasets to effectively evaluate and develop theories and methodologies. The information needs regarding datasets are … |
JUNYONG LIN et. al. | ArXiv | 2025-06-09 |
| 436 | Looking Beyond Visible Cues: Implicit Video Question Answering Via Dual-Clue Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To tackle I-VQA, we propose a novel reasoning framework, IRM (Implicit Reasoning Model), incorporating dual-stream modeling of contextual actions and intent clues as implicit reasoning chains. |
TIEYUAN CHEN et. al. | arxiv-cs.CV | 2025-06-09 |
| 437 | KG2QA: Knowledge Graph-enhanced Retrieval-augmented Generation for Communication Standards Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The rapid evolution of communication technologies has led to an explosion of standards, rendering traditional expert-dependent consultation methods inefficient and slow. To address this challenge, we propose KG2QA, a question answering (QA) framework for communication standards that integrates fine-tuned large language models (LLMs) with a domain-specific knowledge graph (KG) via a retrieval-augmented generation (RAG) pipeline. |
Zhongze Luo; Weixuan Wan; Tianya Zhang; Dan Wang; Xiaoying Tang; | arxiv-cs.CL | 2025-06-08 |
| 438 | M-LLM Based Video Frame Selection for Efficient Video Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, it could lose crucial context in certain periods of a video, so that the downstream M-LLM may not have sufficient visual information to answer a question. To attack this pain point, we propose a light-weight M-LLM-based frame selection method that adaptively selects frames that are more relevant to users’ queries. |
KAI HU et. al. | cvpr | 2025-06-07 |
| 439 | VITED: Video Temporal Evidence Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, they lack the ability to temporally localize such evidence in the broader context of the full video, which is required for answering complex questions. We propose a framework to enhance existing VideoQA datasets with evidence reasoning chains, automatically constructed by searching for optimal intervals of interest in the video with supporting evidence, that maximizes the likelihood of answering a given question. We train our model (ViTED) to generate these evidence chains directly, enabling it to both localize evidence windows as well as perform multi-step reasoning across them in long-form video content. We show the value of our evidence-distilled models on a suite of long video QA benchmarks where we outperform state-of-the-art approaches that lack evidence reasoning capabilities. |
Yujie Lu; Yale Song; William Wang; Lorenzo Torresani; Tushar Nagarajan; | cvpr | 2025-06-07 |
| 440 | Learning to Clarify By Reinforcement Learning Through Reward-Weighted Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we learn to ask clarifying questions in QA agents. |
SUBHOJYOTI MUKHERJEE et. al. | arxiv-cs.CL | 2025-06-07 |
| 441 | Cross-modal Causal Relation Alignment for Video Question Grounding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel VideoQG framework named Cross-modal Causal Relation Alignment (CRA), to eliminate spurious correlations and improve the causal consistency between question-answering and video temporal grounding. |
WEIXING CHEN et. al. | cvpr | 2025-06-07 |
| 442 | Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we transform the KBVQA into linguistic question-answering tasks so that we can leverage the rich world knowledge and strong reasoning abilities of Large Language Models (LLMs). |
ZHEN YANG et. al. | cvpr | 2025-06-07 |
| 443 | Zero-shot 3D Question Answering Via Voxel-based Dynamic Token Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Common methods such as token pooling, reduce visual token usage but often lead to information loss, impairing the model’s ability to preserve visual details essential for 3D question answering tasks. To address this, we propose voxel-based Dynamic Token Compression (DTC), which combines 3D spatial priors and visual semantics to achieve over 90% reduction in visual tokens usage for current multi-frame VLMs. |
HSIANG-WEI HUANG et. al. | cvpr | 2025-06-07 |
| 444 | Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, MLLM may lack fine-grained perception of visual features, which can result in hallucinations during reasoning. To address these challenges, we propose Notes-guided MLLM Reasoning (NoteMR), a novel framework that guides MLLM in better reasoning by utilizing knowledge notes and visual notes. |
Wenlong Fang; Qiaofeng Wu; Jing Chen; Yun Xue; | cvpr | 2025-06-07 |
| 445 | AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, a novel benchmark for audio-visual question answering continual learning (AVQACL) is introduced, aiming to study fine-grained scene understanding and spatial-temporal reasoning in videos under a continual learning setting. |
Kaixuan Wu; Xinde Li; Xinling Li; Chuanfei Hu; Guoliang Wu; | cvpr | 2025-06-07 |
| 446 | Unveiling The Mist Over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To unveil the "mist", we propose Beacon3D, a benchmark for 3D-VL grounding and QA tasks, delivering a perspective shift in the evaluation of 3D-VL understanding. |
JIANGYONG HUANG et. al. | cvpr | 2025-06-07 |
| 447 | EgoLife: Towards Egocentric Life Assistant Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce EgoLife, a project to develop an egocentric life assistant that accompanies and enhances personal efficiency through AI-powered wearable glasses. |
JINGKANG YANG et. al. | cvpr | 2025-06-07 |
| 448 | Flexible Frame Selection for Efficient Video Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose the Flexible Frame Selector (FFS), a learnable policy model with a new flexible selection operation, that helps alleviate input context restrictions by enabling video-language models to focus on the most informative frames for the downstream multimodal task, without adding undue processing cost. |
Shyamal Buch; Arsha Nagrani; Anurag Arnab; Cordelia Schmid; | cvpr | 2025-06-07 |
| 449 | EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce EgoTextVQA, a novel and rigorously constructed benchmark for egocentric QA assistance involving scene text. |
SHENG ZHOU et. al. | cvpr | 2025-06-07 |
| 450 | BIMBA: Selective-Scan Compression for Long-Range Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce BIMBA, an efficient state-space model to handle long-form videos. |
Md Mohaiminul Islam; Tushar Nagarajan; Huiyu Wang; Gedas Bertasius; Lorenzo Torresani; | cvpr | 2025-06-07 |
| 451 | DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a Dual-vision Scene Perception Network (DSPNet), to comprehensively integrate multi-view and point cloud features to improve robustness in 3D QA. |
JINGZHOU LUO et. al. | cvpr | 2025-06-07 |
| 452 | Commonsense Video Question Answering Through Video-Grounded Entailment Tree Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the first video-grounded entailment tree reasoning method for commonsense video question answering (VQA). |
Huabin Liu; Filip Ilievski; Cees G. M. Snoek; | cvpr | 2025-06-07 |
| 453 | CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an MLLMs-based dual momentum Mixture-of-Experts (\texttt CL-MoE ) framework for continual visual question answering. |
TIANYU HUAI et. al. | cvpr | 2025-06-07 |
| 454 | BioMol-MQA: A Multi-Modal Question Answering Dataset For LLM Reasoning Over Bio-Molecular Interactions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the gap, we present BioMol-MQA, a new question-answering (QA) dataset on polypharmacy, which is composed of two parts: (i) a multimodal knowledge graph (KG) with text and molecular structure for information retrieval; and (ii) challenging questions designed to test LLM capabilities in retrieving and reasoning over the multimodal KG to answer questions. |
Saptarshi Sengupta; Shuhua Yang; Paul Kwong Yu; Fali Wang; Suhang Wang; | arxiv-cs.CL | 2025-06-06 |
| 455 | EASG-Bench: Video Q&A Benchmark with Egocentric Action Scene Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce EASG-Bench, a question-answering benchmark for egocentric videos where the question-answering pairs are created from spatio-temporally grounded dynamic scene graphs capturing intricate relationships among actors, actions, and objects. |
IVAN RODIN et. al. | arxiv-cs.CV | 2025-06-06 |
| 456 | TextVidBench: A Benchmark for Long Video Scene Text Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite recent progress on the short-video Text-Visual Question Answering (ViteVQA) task – largely driven by benchmarks such as M4-ViteVQA – existing datasets still suffer from limited video duration and narrow evaluation scopes, making it difficult to adequately assess the growing capabilities of powerful multimodal large language models (MLLMs). To address these limitations, we introduce TextVidBench, the first benchmark specifically designed for long-video text question answering (>3 minutes). |
YANGYANG ZHONG et. al. | arxiv-cs.CV | 2025-06-05 |
| 457 | UTSA-NLP at ArchEHR-QA 2025: Improving EHR Question Answering Via Self-Consistency Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe our system for the ArchEHR-QA Shared Task on answering clinical questions using electronic health records (EHRs). |
Sara Shields-Menard; Zach Reimers; Joshua Gardner; David Perry; Anthony Rios; | arxiv-cs.CL | 2025-06-05 |
| 458 | Trustworthy Medical Question Answering: An Evaluation-Centric Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this survey, we systematically examine six key dimensions of trustworthiness in medical QA, i.e., Factuality, Robustness, Fairness, Safety, Explainability, and Calibration. |
YINUO WANG et. al. | arxiv-cs.CL | 2025-06-04 |
| 459 | Plugging Schema Graph Into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods based on semantic similarity work well only on simplified hand-crafted datasets and struggle to handle complex, real-world scenarios with numerous and diverse columns. To address this, we propose a graph-based framework that leverages human-curated relational knowledge to explicitly encode schema links and join paths. |
Xixi Wang; Miguel Costa; Jordanka Kovaceva; Shuai Wang; Francisco C. Pereira; | arxiv-cs.AI | 2025-06-04 |
| 460 | Hierarchical Question-Answering for Driving Scene Understanding Using Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a hierarchical question-answering (QA) approach for scene understanding in autonomous vehicles, balancing cost-efficiency with detailed visual interpretation. |
Safaa Abdullahi Moallim Mohamud; Minjin Baek; Dong Seog Han; | arxiv-cs.CV | 2025-06-03 |
| 461 | ESGenius: Benchmarking LLMs on Environmental, Social, and Governance (ESG) and Sustainability Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ESGenius, a comprehensive benchmark for evaluating and enhancing the proficiency of Large Language Models (LLMs) in Environmental, Social, and Governance (ESG) and sustainability-focused question answering. |
CHAOYUE HE et. al. | arxiv-cs.CL | 2025-06-02 |
| 462 | IQUEST: An Iterative Question-Guided Framework for Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Knowledge Base Question Answering (KBQA), which queries and reasons over KGs, is central to this effort, especially for complex, multi-hop queries. However, multi-hop reasoning poses two key challenges: (1) maintaining coherent reasoning paths, and (2) avoiding prematurely discarding critical multi-hop connections. To address these issues, we introduce iQUEST, a question-guided KBQA framework that iteratively decomposes complex queries into simpler sub-questions, ensuring a structured and focused reasoning trajectory. Additionally, we integrate a Graph Neural Network (GNN) to look ahead and incorporate 2-hop neighbor information at each reasoning step. |
Shuai Wang; Yinan Yu; | arxiv-cs.CL | 2025-06-02 |
| 463 | COMPKE: Complex Question Answering Under Knowledge Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we argue that these benchmarks fail to effectively evaluate how well the updated models apply this knowledge in real-life scenarios, particularly when questions require complex reasoning, involving one-to-many relationships or multi-step logical intersections. To fill in this gap, we introduce a new benchmark, COMPKE: Complex Question Answering under Knowledge Editing, which includes 11,924 complex questions that reflect real-life situations. |
KEYUAN CHENG et. al. | arxiv-cs.CL | 2025-06-01 |
| 464 | Thoughtful and Cautious Reasoning: A Fine-tuned Knowledge Graph-based Multi-hop Question Answering Framework Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yinghao Zheng; Ling Lu; Yang Hu; Yinong Chen; Aijuan Wang; | Eng. Appl. Artif. Intell. | 2025-06-01 |
| 465 | MKGF: A Multi-modal Knowledge Graph Based RAG Framework to Enhance LVLMs for Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
YINAN WU et. al. | Neurocomputing | 2025-06-01 |
| 466 | DeepRAG: Integrating Hierarchical Reasoning and Process Supervision for Biomedical Multi-Hop QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose DeepRAG, a novel framework that integrates DeepSeek's hierarchical question decomposition capabilities with RAG Gym's unified retrieval-augmented generation optimization using process-level supervision. |
YUELYU JI et. al. | arxiv-cs.CL | 2025-05-31 |
| 467 | Inter-Passage Verification for Multi-evidence Multi-answer QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multi-answer question answering (QA), where questions can have many valid answers, presents a significant challenge for existing retrieval-augmented generation-based QA systems, as these systems struggle to retrieve and then synthesize a large number of evidence passages. To tackle these challenges, we propose a new multi-answer QA framework — Retrieval-augmented Independent Reading with Inter-passage Verification (RI$^2$VER). |
Bingsen Chen; Shengjie Wang; Xi Ye; Chen Zhao; | arxiv-cs.CL | 2025-05-31 |
| 468 | Probing The Geometry of Truth: Consistency and Generalization of Truth Directions in LLMs Across Logical Transformations and Question Answering Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large language models (LLMs) are trained on extensive datasets that encapsulate substantial world knowledge. |
YUNTAI BAO et. al. | arxiv-cs.CL | 2025-05-31 |
| 469 | OntoRAG: Enhancing Question-Answering Through Automated Ontology Derivation from Unstructured Knowledge Bases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces OntoRAG, an automated pipeline designed to derive ontologies from unstructured knowledge bases, with a focus on electrical relay documents. |
Yash Tiwari; Owais Ahmad Lone; Mayukha Pal; | arxiv-cs.AI | 2025-05-31 |
| 470 | Exploring The Impact of Occupational Personas on Domain-Specific QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study analyzes whether personas enhance specialized QA performance by introducing two types of persona: Profession-Based Personas (PBPs) (e.g., scientist), which directly relate to domain expertise, and Occupational Personality-Based Personas (OPBPs) (e.g., scientific person), which reflect cognitive tendencies rather than explicit expertise. |
Eojin Kang; Jaehyuk Yu; Juae Kim; | arxiv-cs.CL | 2025-05-30 |
| 471 | Grid-LOGAT: Grid Based Local and Global Area Transcription for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Grid-based Local and Global Area Transcription(Grid-LoGAT) system for Video Question Answering (VideoQA). |
MD INTISAR CHOWDHURY et. al. | arxiv-cs.CV | 2025-05-30 |
| 472 | ComposeRAG: A Modular and Composable RAG for Corpus-Grounded Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ComposeRAG, a novel modular abstraction that decomposes RAG pipelines into atomic, composable modules. |
RUOFAN WU et. al. | arxiv-cs.CL | 2025-05-30 |
| 473 | TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing evaluation datasets are limited in scope and primarily text-based, lacking a unified and standardized multimodal question-answering (QA) benchmark. To address this issue, we introduce TCM-Ladder, the first multimodal QA dataset specifically designed for evaluating large TCM language models. |
JIACHENG XIE et. al. | arxiv-cs.CL | 2025-05-29 |
| 474 | MedPAIR: Measuring Physicians and AI Relevance Alignment in Medical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce the MedPAIR (Medical Dataset Comparing Physicians and AI Relevance Estimation and Question Answering) dataset to evaluate how physician trainees and LLMs prioritize relevant information when answering QA questions. |
YUEXING HAO et. al. | arxiv-cs.CL | 2025-05-29 |
| 475 | Climate Finance Bench Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We curate 33 recent sustainability reports in English drawn from companies across all 11 GICS sectors and annotate 330 expert-validated question-answer pairs that span pure extraction, numerical reasoning, and logical reasoning. Building on this dataset, we propose a comparison of RAG (retrieval-augmented generation) approaches. |
RAFIK MANKOUR et. al. | arxiv-cs.CL | 2025-05-28 |
| 476 | Improving QA Efficiency with DistilBERT: Fine-Tuning and Inference on Mobile Intel CPUs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents an efficient transformer-based question-answering (QA) model optimized for deployment on a 13th Gen Intel i7-1355U CPU, using the Stanford Question Answering Dataset (SQuAD) v1.1. |
Ngeyen Yinkfu; | arxiv-cs.CL | 2025-05-28 |
| 477 | Rethinking Information Synthesis in Multimodal Question Answering A Multi-Agent Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these approaches have shown strong performance, they often rely on a single, generalized reasoning strategy, overlooking the unique characteristics of each modality, ultimately limiting both accuracy and interpretability. To address these limitations, we propose MAMMQA, a multi-agent QA framework for multimodal inputs spanning text, tables, and images. |
Krishna Singh Rajput; Tejas Anvekar; Chitta Baral; Vivek Gupta; | arxiv-cs.CL | 2025-05-27 |
| 478 | RISE: Reasoning Enhancement Via Iterative Self-Exploration in Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Retrieval-Augmented Generation (RAG), widely employed in MHQA tasks, faces challenges in effectively filtering noisy data and retrieving all necessary evidence, thereby limiting its effectiveness in addressing MHQA challenges. To address these challenges, we propose RISE: Reasoning Enhancement via Iterative Self-Exploration, a novel framework designed to enhance models’ reasoning capability through iterative self-exploration. |
BOLEI HE et. al. | arxiv-cs.CL | 2025-05-27 |
| 479 | Faithfulness-Aware Uncertainty Quantification for Fact-Checking The Output of Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce FRANQ (Faithfulness-based Retrieval Augmented UNcertainty Quantification), a novel method for hallucination detection in RAG outputs. |
EKATERINA FADEEVA et. al. | arxiv-cs.CL | 2025-05-27 |
| 480 | It’s High Time: A Survey of Temporal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this survey, we provide a comprehensive overview of Temporal Question Answering (TQA), a research area that focuses on answering questions involving temporal constraints or context. |
Bhawna Piryani; Abdelrahman Abdallah; Jamshid Mozafari; Avishek Anand; Adam Jatowt; | arxiv-cs.CL | 2025-05-26 |
| 481 | Hierarchical Retrieval with Evidence Curation for Open-Domain Financial Question Answering on Standardized Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This similarity forces traditional RAG methods to misidentify near-duplicate text, leading to duplicate retrieval that undermines accuracy and completeness. To address these issues, we propose the Hierarchical Retrieval with Evidence Curation (HiREC) framework. |
Jaeyoung Choe; Jihoon Kim; Woohwan Jung; | arxiv-cs.IR | 2025-05-26 |
| 482 | GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While synthetic data generation has emerged as a promising solution, existing approaches frequently suffer from factual inaccuracies, insufficient long-tail coverage, simplistic knowledge structures, and homogenized outputs. To address these challenges, we introduce GraphGen, a knowledge graph-guided framework designed for three key question-answering (QA) scenarios: atomic QA, aggregated QA, and multi-hop QA. |
ZIHONG CHEN et. al. | arxiv-cs.CL | 2025-05-26 |
| 483 | Benchmarking Large Multimodal Models for Ophthalmic Visual Question Answering with OphthalWeChat Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Subset-specific performance showed Gemini 2.0 Flash excelled in Binary_CN (0.687), Single-choice_CN (0.666), and Single-choice_EN (0.646), while GPT-4o ranked highest in Binary_EN (0.717), Open-ended_CN (BLEU-1: 0.301; BERTScore: 0.382), and Open-ended_EN (BLEU-1: 0.183; BERTScore: 0.240). Conclusions: This study presents the first bilingual VQA benchmark for ophthalmology, distinguished by its real-world context and inclusion of multiple examinations per patient. |
PUSHENG XU et. al. | arxiv-cs.CV | 2025-05-26 |
| 484 | AMQA: An Adversarial Dataset for Benchmarking Bias of LLMs in Medicine and Healthcare Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Bias linked to race, sex, and socioeconomic status is already well known, but a consistent and automatic testbed for measuring it is missing. To fill this gap, this paper presents AMQA — an Adversarial Medical Question-Answering dataset — built for automated, large-scale bias evaluation of LLMs in medical QA. |
YING XIAO et. al. | arxiv-cs.AI | 2025-05-26 |
| 485 | Automated Text-to-Table for Reasoning-Intensive Table QA: Pipeline Design and Benchmarking Insights Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing research is constrained by two primary bottlenecks: 1) Reliance on costly manually annotated real-world data, which makes it difficult to cover complex reasoning scenarios; 2) The heterogeneity of table structures hinders systematic analysis of the intrinsic mechanisms behind the underperformance of LLMs, especially in reasoning-intensive tasks. To address these issues, we propose an automated generation pipeline AutoT2T that transforms mathematical word problems into table-based reasoning tasks, eliminating the need for manual annotation. |
SHI-YU TIAN et. al. | arxiv-cs.AI | 2025-05-26 |
| 486 | Beyond Chunking: Discourse-Aware Hierarchical Retrieval for Long Document Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a discourse-aware hierarchical framework that leverages rhetorical structure theory (RST) to enhance long document question answering. |
Huiyao Chen; Yi Yang; Yinghui Li; Meishan Zhang; Min Zhang; | arxiv-cs.IR | 2025-05-26 |
| 487 | IndustryEQA: Pushing The Frontiers of Embodied Question Answering in Industrial Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This drawback limits the evaluation of agent readiness for real-world industrial applications. To bridge this, we introduce IndustryEQA, the first benchmark dedicated to evaluating embodied agent capabilities within safety-critical warehouse scenarios. |
YIFAN LI et. al. | arxiv-cs.CV | 2025-05-26 |
| 488 | MM-Prompt: Cross-Modal Prompt Tuning for Continual Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, most existing methods adopt cross-modal prompt isolation, constructing visual and textual prompts separately, which exacerbates modality imbalance and leads to degraded performance over time. To tackle this issue, we propose MM-Prompt, a novel framework incorporating cross-modal prompt query and cross-modal prompt recovery. |
Xu Li; Fan Lyu; | arxiv-cs.CV | 2025-05-25 |
| 489 | Hypercube-Based Retrieval-Augmented Generation for Scientific Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a multi-dimensional (cube) structure, Hypercube, which can index and allocate documents in a pre-defined multi-dimensional space. |
JIMENG SHI et. al. | arxiv-cs.LG | 2025-05-25 |
| 490 | GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel four-stage framework called Grounding Caption-Guided Knowledge-Based Visual Question Answering (GC-KBVQA), which enables LLMs to effectively perform zero-shot VQA tasks without the need for end-to-end multimodal training. |
Mohammad Mahdi Moradi; Sudhir Mudur; | arxiv-cs.CL | 2025-05-25 |
| 491 | SpokenNativQA: Multilingual Everyday Spoken Queries for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce SpokenNativQA, the first multilingual and culturally aligned spoken question-answering (SQA) dataset designed to evaluate LLMs in real-world conversational settings. |
Firoj Alam; Md Arid Hasan; Shammur Absar Chowdhury; | arxiv-cs.CL | 2025-05-25 |
| 492 | Toward Human Centered Interactive Clinical Question Answering System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces an interactive QA system that enables physicians to query clinical notes via text or voice and receive extractive answers highlighted directly in the note for traceability. |
Dina Albassam; | arxiv-cs.HC | 2025-05-24 |
| 493 | Enhancing Large Vision-Language Models with Layout Modality for Table Question Answering on Japanese Annual Securities Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a method to enhance LVLM-based table understanding by incorporating in-table textual content and layout features. |
Hayato Aida; Kosuke Takahashi; Takahiro Omi; | arxiv-cs.CL | 2025-05-23 |
| 494 | How Knowledge Popularity Influences and Enhances LLM Knowledge Boundary Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate how knowledge popularity affects LLMs’ ability to perceive their knowledge boundaries. |
Shiyu Ni; Keping Bi; Jiafeng Guo; Xueqi Cheng; | arxiv-cs.CL | 2025-05-23 |
| 495 | PerMedCQA: Benchmarking Large Language Models on Medical Consumer Question Answering in Persian Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite recent advances in large language models (LLMs) for medical QA, consumer-oriented and multilingual resources, particularly in low-resource languages like Persian, remain sparse. To bridge this gap, we present PerMedCQA, the first Persian-language benchmark for evaluating LLMs on real-world, consumer-generated medical questions. |
Naghmeh Jamali; Milad Mohammadi; Danial Baledi; Zahra Rezvani; Hesham Faili; | arxiv-cs.CL | 2025-05-23 |
| 496 | O$^2$-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Open-ended questions, which are characterized by lacking a standard answer or providing non-unique and diverse answers, remain underexplored. To bridge this gap, we present O$^2$-Searcher, a novel search agent leveraging reinforcement learning to effectively tackle both open-ended and closed-ended questions in the open domain. |
JIANBIAO MEI et. al. | arxiv-cs.CL | 2025-05-22 |
| 497 | T$^2$: An Adaptive Test-Time Scaling Strategy for Contextual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: But they add human bias to the reasoning process and fail to leverage models’ inherent reasoning capabilities. To address these limitations, we present T$^2$: Think-to-Think, a novel framework that dynamically adapts reasoning depth based on question complexity. |
ZHENGYI ZHAO et. al. | arxiv-cs.CL | 2025-05-22 |
| 498 | Benchmarking Retrieval-Augmented Multimodal Generation for Document Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce MMDocRAG, a comprehensive benchmark featuring 4,055 expert-annotated QA pairs with multi-page, cross-modal evidence chains. |
KUICAI DONG et. al. | arxiv-cs.IR | 2025-05-22 |
| 499 | O2-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs), despite their advancements, are fundamentally limited by their static parametric knowledge, hindering performance on tasks requiring open-domain … |
JIANBIAO MEI et. al. | ArXiv | 2025-05-22 |
| 500 | CT-Agent: A Multimodal-LLM Agent for 3D CT Radiology Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing VQA systems cannot adequately handle the CT radiology question answering (CTQA) task because: (1) anatomic complexity makes CT images difficult to understand; and (2) spatial relationships across hundreds of slices are difficult to capture. To address these issues, this paper proposes CT-Agent, a multimodal agentic framework for CTQA. |
Yuren Mao; Wenyi Xu; Yuyang Qin; Yunjun Gao; | arxiv-cs.CV | 2025-05-22 |
| 501 | A Question-type Guided and Progressive Self-attention Network for Remote Sensing Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jiangfan Feng; Hui Wang; Shaokang Dong; | Earth Sci. Informatics | 2025-05-22 |
| 502 | Teaching Large Language Models to Maintain Contextual Faithfulness Via Synthetic Tasks and Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Notably, Dual-GRPO eliminates the need to manually label preference data to train reward models and avoids over-optimizing short-form generation when relying only on the synthesized short-form QA data. |
SHUZHENG SI et. al. | arxiv-cs.CL | 2025-05-22 |
| 503 | LiveVLM: Efficient Online Video Understanding Via Streaming-Oriented KV Cache and Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, studies predominantly focus on offline video question answering, neglecting memory usage and response speed that are essential in various real-world applications, such as Deepseek services, autonomous driving, and robotics. To mitigate these challenges, we propose $\textbf{LiveVLM}$, a training-free framework specifically designed for streaming, online video understanding and real-time interaction. |
ZHENYU NING et. al. | arxiv-cs.CV | 2025-05-21 |
| 504 | Social Bias in Popular Question-Answering Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We perform a qualitative content analysis of 30 benchmark papers and a quantitative analysis of 20 respective benchmark datasets to learn (1) who is involved in the benchmark creation, (2) how social bias is addressed or prevented, and (3) whether the demographics of the creators and annotators correspond to particular biases in the content. |
Angelie Kraft; Judith Simon; Sonja Schimmler; | arxiv-cs.CL | 2025-05-21 |
| 505 | Visual Question Answering on Multiple Remote Sensing Image Modalities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose to add multiple image modalities to VQA in the particular context of remote sensing, leading to a novel task for the computer vision community. |
HICHEM BOUSSAID et. al. | arxiv-cs.CV | 2025-05-21 |
| 506 | StepSearch: Igniting LLMs Search Ability Via Step-Wise Proximal Policy Optimization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Previous work has explored reinforcement learning (RL) to train LLMs to perform search-based document retrieval, achieving notable improvements in QA performance, but underperforming on complex multi-hop QA due to the sparse rewards from global signals only. To address this gap in existing research, we introduce StepSearch, a framework for search LLMs that is trained with a step-wise proximal policy optimization method. |
ZILIANG WANG et. al. | arxiv-cs.CL | 2025-05-21 |
| 507 | KaFT: Knowledge-aware Fine-tuning for Boosting LLMs’ Domain-specific Question-Answering Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that 1) training samples with varied conflicts contribute differently, where SFT on the data with large conflicts leads to catastrophic performance drops; 2) compared to directly filtering out the conflict data, appropriately applying the conflict data would be more beneficial. Motivated by this, we propose a simple-yet-effective Knowledge-aware Fine-tuning (namely KaFT) approach to effectively boost LLMs’ performance. |
QIHUANG ZHONG et. al. | arxiv-cs.CL | 2025-05-21 |
| 508 | ViQAgent: Zero-Shot Video Question Answering Via Agent with Open-Vocabulary Grounding Validation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent advancements in Video Question Answering (VideoQA) have introduced LLM-based agents, modular frameworks, and procedural solutions, yielding promising results. |
Tony Montes; Fernando Lozano; | arxiv-cs.CV | 2025-05-21 |
| 509 | CRAFT: Training-Free Cascaded Retrieval for Tabular QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach achieves better retrieval performance than state-of-the-art (SOTA) sparse, dense, and hybrid retrievers. |
Adarsh Singh; Kushal Raj Bhandari; Jianxi Gao; Soham Dan; Vivek Gupta; | arxiv-cs.CL | 2025-05-20 |
| 510 | QA-prompting: Improving Summarization with Large Language Models Using Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, models often struggle with long-context summarization due to positional biases, leading to suboptimal extraction of critical information. There are techniques to improve this with fine-tuning, pipelining, or using complex techniques, which have their own challenges. To solve these challenges, we propose QA-prompting – a simple prompting method for summarization that utilizes question-answering as an intermediate step prior to summary generation. |
Neelabh Sinha; | arxiv-cs.CL | 2025-05-20 |
| 511 | Reinforcing Question Answering Agents with Minimalist Policy Gradient Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Mujica, a Multi-hop Joint Intelligence for Complex Question Answering, comprising a planner that decomposes questions into a directed acyclic graph of subquestions and a worker that resolves questions via retrieval and reasoning. |
YIHONG WU et. al. | arxiv-cs.CL | 2025-05-20 |
| 512 | VoQA: Visual-only Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Visual-only Question Answering (VoQA), a novel multimodal task in which questions are visually embedded within images, without any accompanying textual input. |
Luyang Jiang; Jianing An; Jie Luo; Wenjun Wu; Lei Huang; | arxiv-cs.CV | 2025-05-20 |
| 513 | Automatic Dataset Generation for Knowledge Intensive Question Answering Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A novel approach for enhancing Large Language Models (LLMs) in knowledge-intensive QA tasks is presented through the automated generation of context-based QA pairs. |
Sizhe Yuen; Ting Su; Ziyang Wang; Yali Du; Adam J. Sobey; | arxiv-cs.CL | 2025-05-20 |
| 514 | Beyond Chains: Bridging Large Language Models and Knowledge Bases in Complex Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by semantic parsing methods, we propose PDRR: a four-stage framework consisting of Predict, Decompose, Retrieve, and Reason. |
Yihua Zhu; Qianying Liu; Akiko Aizawa; Hidetoshi Shimodaira; | arxiv-cs.CL | 2025-05-20 |
| 515 | Texts or Images? A Fine-grained Analysis on The Effectiveness of Input Representations and Models for Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct the first controlled study on the effectiveness of several combinations of table representations and models from two perspectives: question complexity and table size. |
Wei Zhou; Mohsen Mesgar; Heike Adel; Annemarie Friedrich; | arxiv-cs.CL | 2025-05-20 |
| 516 | YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce YESciEval, an open-source framework that combines fine-grained rubric-based assessment with reinforcement learning to mitigate optimism bias in LLM evaluators. |
Jennifer D’Souza; Hamed Babaei Giglou; Quentin Münch; | arxiv-cs.CL | 2025-05-20 |
| 517 | Memory-Centric Embodied Question Answer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a memory-centric EQA framework named MemoryEQA. |
Mingliang Zhai; Zhi Gao; Yuwei Wu; Yunde Jia; | arxiv-cs.CL | 2025-05-20 |
| 518 | Domain Adaptation of VLM for Soccer Video Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Vision Language Models (VLMs) have demonstrated strong performance in multi-modal tasks by effectively aligning visual and textual representations. However, most video understanding VLM research has been domain-agnostic, leaving the understanding of their transfer learning capability to specialized domains under-explored. In this work, we address this by exploring the adaptability of open-source VLMs to specific domains, and focusing on soccer as an initial case study. |
Tiancheng Jiang; Henry Wang; Md Sirajus Salekin; Parmida Atighehchian; Shinan Zhang; | arxiv-cs.CV | 2025-05-19 |
| 519 | Q${}^2$Forge: Minting Competency Questions and SPARQL Queries for Question-Answering Over Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Q${}^2$Forge that addresses the challenge of generating new competency questions for a KG and corresponding SPARQL queries. |
Yousouf Taghzouti; Franck Michel; Tao Jiang; Louis-Félix Nothias; Fabien Gandon; | arxiv-cs.DB | 2025-05-19 |
| 520 | Structured Retrieval-Augmented Generation for Multi-Entity Question Answering Over Heterogeneous Sources Related Papers Related Patents Related Grants Related Venues Related Experts View |
Teng Lin; | 2025 IEEE 41st International Conference on Data Engineering … | 2025-05-19 |
| 521 | FeVisQA: Free-Form Question Answering Over Data Visualizations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new task named FeVisQA, referring to Free-form Question Answering over data Visualizations. |
YUANFENG SONG et. al. | icde | 2025-05-19 |
| 522 | AMAQA: A Metadata-based QA Dataset for RAG Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Retrieval-augmented generation (RAG) systems are widely used in question-answering (QA) tasks, but current benchmarks lack metadata integration, hindering evaluation in scenarios requiring both textual data and external information. To address this, we present AMAQA, a new open-access QA dataset designed to evaluate tasks combining text and metadata. |
Davide Bruni; Marco Avvenuti; Nicola Tonellotto; Maurizio Tesconi; | arxiv-cs.IR | 2025-05-19 |
| 523 | Disambiguation in Conversational Question Answering in The Era of LLMs and Agents: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By offering a comprehensive review of current research on ambiguities and disambiguation with LLMs, we aim to contribute to the development of more robust and reliable LLM-based systems. |
MD MEHRAB TANJIM et. al. | arxiv-cs.CL | 2025-05-18 |
| 524 | Enhancing Large Language Models with Reward-guided Tree Search for Knowledge Graph Question and Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, the complex semantics contained in questions may lead to the retrieval of inaccurate reasoning paths. To address these issues, this paper proposes a novel and training-free framework for KGQA tasks called Reward-guided Tree Search on Graph (RTSoG). |
XIAO LONG et. al. | arxiv-cs.CL | 2025-05-18 |
| 525 | SurveillanceVQA-589K: A Benchmark for Comprehensive Surveillance Video-Language Understanding with Large Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce SurveillanceVQA-589K, the largest open-ended video question answering benchmark tailored to the surveillance domain. |
BO LIU et. al. | arxiv-cs.CV | 2025-05-18 |
| 526 | Beyond Retrieval: Joint Supervision and Multimodal Document Ranking for Textbook Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel approach to multimodal textbook question answering by introducing a mechanism for enhancing semantic representations through multi-objective joint training. |
Hessa Alawwad; Usman Naseem; Areej Alhothali; Ali Alkhathlan; Amani Jamal; | arxiv-cs.IR | 2025-05-17 |
| 527 | Recursive Question Understanding for Complex Question Answering Over Heterogeneous Personal Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present ReQAP, a novel method that creates an executable operator tree for a given question, via recursive decomposition. |
Philipp Christmann; Gerhard Weikum; | arxiv-cs.CL | 2025-05-17 |
| 528 | A Dataset for Spatiotemporal-Sensitive POI Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing Question-Answering (QA) datasets lack sufficient spatiotemporal-sensitive questions, making them inadequate benchmarks for evaluating models’ spatiotemporal reasoning capabilities. To address this gap, we introduce POI-QA, a novel spatiotemporal-sensitive QA dataset centered on Point of Interest (POI), constructed through three key steps: mining and aligning open-source vehicle trajectory data from GAIA with high-precision geographic POI data, rigorous manual validation of noisy spatiotemporal facts, and generating bilingual (Chinese/English) QA pairs that reflect human-understandable spatiotemporal reasoning tasks. |
XIAO HAN et. al. | arxiv-cs.LG | 2025-05-16 |
| 529 | THELMA: Task Based Holistic Evaluation of Large Language Model Applications-RAG Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose THELMA (Task Based Holistic Evaluation of Large Language Model Applications), a reference free framework for RAG (Retrieval Augmented generation) based question answering (QA) applications. |
UDITA PATEL et. al. | arxiv-cs.CL | 2025-05-16 |
| 530 | Semantic Caching of Contextual Summaries for Efficient Question-Answering with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel semantic caching approach for storing and reusing intermediate contextual summaries, enabling efficient information reuse across similar queries in LLM-based QA workflows. |
Camille Couturier; Spyros Mastorakis; Haiying Shen; Saravan Rajmohan; Victor Rühle; | arxiv-cs.CL | 2025-05-16 |
| 531 | CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they still face challenges in balancing retrieval precision and recall, impacting their efficacy in answering questions. To address this, we introduce $\textbf{CAFE}$, a two-stage coarse-to-fine method to enhance multi-document question-answering capacities. |
Han Peng; Jinhao Jiang; Zican Dong; Wayne Xin Zhao; Lei Fang; | arxiv-cs.CL | 2025-05-15 |
| 532 | Enhancing Multi-Image Question Answering Via Submodular Subset Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an enhancement for retriever framework introduced in MIRAGE model using submodular subset selection techniques. |
Aaryan Sharma; Shivansh Gupta; Samar Agarwal; Vishak Prasad C.; Ganesh Ramakrishnan; | arxiv-cs.CV | 2025-05-15 |
| 533 | PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts Into Prompt Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Parameter-efficient fine-tuning (PEFT) methods have shown promise in adapting large language models, yet existing approaches exhibit counter-intuitive phenomena: integrating a router into prompt tuning (PT) increases training efficiency yet does not improve performance universally; parameter reduction through matrix decomposition can improve performance in specific domains. Motivated by these observations and the modular nature of PT, we propose PT-MoE, a novel framework that integrates matrix decomposition with mixture-of-experts (MoE) routing for efficient PT. |
Zongqian Li; Yixuan Su; Nigel Collier; | arxiv-cs.CL | 2025-05-14 |
| 534 | WixQA: A Multi-Dataset Benchmark for Enterprise Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Critically, evaluating end-to-end RAG systems requires benchmarks comprising not only question–answer pairs but also the specific knowledge base (KB) snapshot from which answers were derived. To address this need, we introduce WixQA, a benchmark suite featuring QA datasets precisely grounded in the released KB corpus, enabling holistic evaluation of retrieval and generation components. |
DVIR COHEN et. al. | arxiv-cs.AI | 2025-05-13 |
| 535 | Efficient and Reproducible Biomedical Question Answering Using Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study systematically examines a Retrieval-Augmented Generation (RAG) system for biomedical QA, evaluating retrieval strategies and response time trade-offs. |
Linus Stuhlmann; Michael Alexander Saxer; Jonathan Fürst; | arxiv-cs.IR | 2025-05-12 |
| 536 | Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Task 5 of the DCASE 2025 Challenge: an Audio Question Answering (AQA) benchmark spanning multiple domains of sound understanding. |
CHAO-HAN HUCK YANG et. al. | arxiv-cs.SD | 2025-05-12 |
| 537 | NeoQA: Evidence-based Question Answering with Generated News Events Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Questions initially requiring retrieval may become answerable from pretraining knowledge as newer models incorporate more recent information during pretraining, making it difficult to distinguish evidence-based reasoning from recall. We introduce NeoQA (News Events for Out-of-training Question Answering), a benchmark designed to address this issue. |
Max Glockner; Xiang Jiang; Leonardo F. R. Ribeiro; Iryna Gurevych; Markus Dreyer; | arxiv-cs.CL | 2025-05-09 |
| 538 | ChartCitor: Answer Citations for ChartQA Via Multi-Agent LLM Retrieval Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) can perform chart question answering tasks but often generate unverified hallucinated responses. Existing answer attribution methods struggle to … |
Kanika Goswami; Puneet Mathur; R. Rossi; Franck Dernoncourt; | Companion Proceedings of the ACM on Web Conference 2025 | 2025-05-08 |
| 539 | IP-VQA Dataset: Empowering Precision Agriculture with Autonomous Insect Pest Management Through Visual Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Precision agriculture is essential for social good, global economy and food security, yet insect pests threaten productivity through crop damage, pathogen spread, and rising pest … |
Kairui Jin; Xing Zi; Karthick Thiyagarajan; Ali Braytee; Mukesh Prasad; | Companion Proceedings of the ACM on Web Conference 2025 | 2025-05-08 |
| 540 | Fine-Tuning Large Language Models and Evaluating Retrieval Methods for Improved Question Answering on Building Codes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study focuses on identifying a suitable retriever method for building codes and optimizing the generational capability of the language model using fine-tuning techniques. |
Mohammad Aqib; Mohd Hamza; Qipei Mei; Ying Hei Chui; | arxiv-cs.CL | 2025-05-07 |
| 541 | A Reasoning-Focused Legal Retrieval Benchmark IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: An obstacle to the development of specialized RAG systems is the lack of realistic legal RAG benchmarks which capture the complexity of both legal retrieval and downstream legal question-answering. To address this, we introduce two novel legal RAG benchmarks: Bar Exam QA and Housing Statute QA. |
LUCIA ZHENG et. al. | arxiv-cs.CL | 2025-05-06 |
| 542 | IndicSQuAD: A Comprehensive Multilingual Question Answering Dataset for Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present IndicSQuAD, a comprehensive multi-lingual extractive QA dataset covering nine major Indic languages, systematically derived from the SQuAD dataset. |
Sharvi Endait; Ruturaj Ghatage; Aditya Kulkarni; Rajlaxmi Patil; Raviraj Joshi; | arxiv-cs.CL | 2025-05-06 |
| 543 | LLM-Driven Data Augmentation for Visual Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Remote Sensing Visual Question Answering (RSVQA) is a task aiming at automatically answering questions related to overhead imagery. Many studies have been conducted in recent years, … |
Hichem Boussaid; Nayoung Kwon; Camille Kurtz; Laurent Wendling; Sylvain Lobry; | 2025 Joint Urban Remote Sensing Event (JURSE) | 2025-05-05 |
| 544 | Structure Causal Models and LLMs Integration in Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a causal inference framework for the MedVQA task, which effectively eliminates the relative confounding effect between the image and the question to ensure the precision of the question-answering (QA) session. |
Zibo Xu; Qiang Li; Weizhi Nie; Weijie Wang; Anan Liu; | arxiv-cs.CV | 2025-05-05 |
| 545 | CoRAC: Integrating Selective API Document Retrieval with Question Semantic Intent for Code Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a knowledge-based framework, CoRAC, an automatic code question responder that enhances understanding through selective API document retrieval and question semantic intent clustering. |
YunSeok Choi; CheolWon Na; Jee-Hyong Lee; | naacl | 2025-05-04 |
| 546 | Analyzing and Improving Coherence of Large Language Models in Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze the behavior of multiple LLMs, including Mixtral-8x7B, Llama2-70b, Smaug-72b, and Phi-3, when dealing with multiple lexical variations of the same info-seeking questions. |
Ivano Lauriola; Stefano Campese; Alessandro Moschitti; | naacl | 2025-05-04 |
| 547 | PeerQA: A Scientific Question Answering Dataset from Peer Reviews Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present PeerQA, a real-world, scientific, document-level Question Answering (QA) dataset. |
Tim Baumgärtner; Ted Briscoe; Iryna Gurevych; | naacl | 2025-05-04 |
| 548 | CuriousLLM: Elevating Multi-Document Question Answering with LLM-Enhanced Knowledge Graph Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose CuriousLLM, an enhancement that integrates a curiosity-driven reasoning mechanism into an LLM agent. |
Zukang Yang; Zixuan Zhu; Jennifer Zhu; | naacl | 2025-05-04 |
| 549 | Diversify-verify-adapt: Efficient and Robust Retrieval-Augmented Ambiguous Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although the iterative RAG approach has been proposed to address this problem, it comes at the cost of significantly reduced efficiency. To address these issues, we propose the diversify-verify-adapt (DIVA) framework. |
YEONJUN IN et. al. | naacl | 2025-05-04 |
| 550 | MAPWise: Evaluating Vision-Language Models for Advanced Map Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the efficacy of VLMs in answering questions based on choropleth maps, which are widely used for data analysis and representation. To facilitate and encourage research in this area, we introduce a novel map-based question-answering benchmark, consisting of maps from three geographical regions (United States, India, China), each containing around 1000 questions. |
SRIJA MUKHOPADHYAY et. al. | naacl | 2025-05-04 |
| 551 | THREAD: Thinking Deeper with Recursive Spawning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large language models (LLMs) have shown impressive capabilities across diverse settings, but still struggle as the length and complexity of the context increases. To address this challenge, we propose Thinking Recursively and Dynamically (ThReaD). |
Philip Schroeder; Nathaniel W. Morgan; Hongyin Luo; James R. Glass; | naacl | 2025-05-04 |
| 552 | Pointwise Mutual Information As A Performance Gauge for Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there is no method to date that exploits this phenomenon to improve generation. To fill this gap, in this study, we show that the pointwise mutual information between a context and a question is an effective gauge for language model performance. |
TIANYU LIU et. al. | naacl | 2025-05-04 |
| 553 | Reverse Question Answering: Can An LLM Write A Question So Hard (or Bad) That It Can’t Answer? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By finding question and answer types that lead to RQA errors, we suggest improvements for LLM reasoning. |
NISHANT BALEPUR et. al. | naacl | 2025-05-04 |
| 554 | VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities Via Single-Stage Joint Speech-Text Supervised Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Another critical challenge with SpeechLMs is catastrophic forgetting, where models optimized for speech tasks suffer significant degradation in text-only performance. To mitigate these issues, we propose a novel single-stage joint speech-text SFT approach on the low-rank adaptation (LoRA) of the LLM backbone. |
YIFAN PENG et. al. | naacl | 2025-05-04 |
| 555 | Benchmarking Large Language Models on Answering and Explaining Challenging Medical Questions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, the lack of reference explanations means we cannot easily evaluate the reasoning of model decisions, a crucial component of supporting doctors in making complex medical decisions. To address these challenges, we construct two new datasets: JAMA Clinical Challenge and Medbullets. |
Hanjie Chen; Zhouxiang Fang; Yash Singla; Mark Dredze; | naacl | 2025-05-04 |
| 556 | K-COMP: Retrieval-Augmented Medical Domain Question Answering With Knowledge-Injected Compressor Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View |
Jeonghun Cho; Gary Lee; | naacl | 2025-05-04 |
| 557 | DeCAP: Context-Adaptive Prompt Generation for Debiasing Zero-shot Question Answering in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing zero-shot methods are efficient but fail to consider context and prevent bias propagation in the answers. To address this, we propose *DeCAP*, a method for debiasing LLMs using Context-Adaptive Prompt Generation. |
Suyoung Bae; YunSeok Choi; Jee-Hyong Lee; | naacl | 2025-05-04 |
| 558 | SUNAR: Semantic Uncertainty Based Neighborhood Aware Retrieval for Complex QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce SUNAR, a novel approach that leverages LLMs to guide a Neighborhood Aware Retrieval process. |
Venktesh V; Mandeep Rathee; Avishek Anand; | naacl | 2025-05-04 |
| 559 | From Generating Answers to Building Explanations: Integrating Multi-Round RAG and Causal Modeling for Scientific QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by findings from the social sciences, we present an implemented causal QA approach that combines iterative RAG with guidance from a formal model of causation. |
VICTOR BARRES et. al. | naacl | 2025-05-04 |
| 560 | ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel evaluation dataset, ProMQA, to measure the advancement of systems in application-oriented scenarios. |
KIMIHIRO HASEGAWA et. al. | naacl | 2025-05-04 |
| 561 | SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, adapting general-purpose RAG systems to specialized fields such as science and medicine poses unique challenges due to distribution shifts and limited access to domain-specific data. To tackle this, we propose SimRAG, a self-training approach that equips LLMs with joint capabilities of question answering and question generation for domain adaptation. |
RAN XU et. al. | naacl | 2025-05-04 |
| 562 | MoEMoE: Question Guided Dense and Scalable Sparse Mixture-of-Expert for Multi-source Multi-modal Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we formulate a novel question-answer generation (QAG) framework in an environment containing multi-source, multimodal information. |
Vinay Kumar Verma; Shreyas Sunil Kulkarni; Happy Mittal; Deepak Gupta; | naacl | 2025-05-04 |
| 563 | MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Current benchmarks primarily focus on single-chart tasks, neglecting the multi-hop reasoning required to extract and integrate information from multiple charts, which is essential in practical applications. To fill this gap, we introduce MultiChartQA, a benchmark that evaluates MLLMs’ capabilities in four key areas: direct question answering, parallel question answering, comparative reasoning, and sequential reasoning. |
Zifeng Zhu; Mengzhao Jia; Zhihan Zhang; Lang Li; Meng Jiang; | naacl | 2025-05-04 |
| 564 | Automatic Evaluation of Healthcare LLMs Beyond Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work is focused on the healthcare domain, where both factuality and discourse matter greatly. It introduces a comprehensive, multi-axis suite for healthcare LLM evaluation, exploring correlations between open and close benchmarks and metrics. |
ANNA ARIAS-DUART et. al. | naacl | 2025-05-04 |
| 565 | Generating Complex Question Decompositions in The Face of Distribution Shifts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One way of improving LLM training and fine-tuning is to leverage synthetic training data, but the superior performance of supervised approaches collapses in the face of distribution shifts, making them unsuitable for generating synthetic data across new domains and at scale. To address this, we propose an approach to generate synthetic decomposition data with only five annotated examples; we do this by (i) extending recent advancements in using LLM-as-judge and for reranking in novel ways, as well as (ii) using a panel of smaller-sized LLMs for data generation instead of resource-intensive larger models. |
Kelvin Han; Claire Gardent; | naacl | 2025-05-04 |
| 566 | Hybrid Graphs for Table-and-Text Based Question Answering Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel Hybrid Graph-based approach for Table-Text QA that leverages LLMs without fine-tuning. |
Ankush Agarwal; Chaitanya Devaguptapu; Ganesh S; | naacl | 2025-05-04 |
| 567 | VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose VisDoMRAG, a novel multimodal Retrieval Augmented Generation (RAG) approach that simultaneously utilizes visual and textual RAG, combining robust visual retrieval capabilities with sophisticated linguistic reasoning. |
MANAN SURI et. al. | naacl | 2025-05-04 |
| 568 | TRAVELER: A Benchmark for Evaluating Temporal Reasoning Across Vague, Implicit and Explicit References Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although existing benchmarks address a system’s ability to reason about and resolve temporal references, systematic evaluation of specific temporal references remains limited. Towards closing this gap, we introduce TRAVELER, a novel synthetic benchmark dataset that follows a Question Answering paradigm and consists of questions involving temporal references with the corresponding correct answers. |
Svenja Kenneweg; Jörg Deigmöller; Philipp Cimiano; Julian Eggert; | arxiv-cs.CL | 2025-05-02 |
| 569 | Interaction Configurations and Prompt Guidance in Conversational AI for Question Answering in Human-AI Teams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Extending from our initial formative study, which revealed challenges in human utilization of conversational AI support, we designed two configurations for prompt guidance: a Nudging approach, where the AI suggests potential responses for human agents, and a Highlight strategy, emphasizing crucial parts of reference documents to aid human responses. |
JAEYOON SONG et. al. | arxiv-cs.HC | 2025-05-02 |
| 570 | Exp-VQA: Fine-grained Facial Expression Analysis Via Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yujian Yuan; Jiabei Zeng; Shiguang Shan; | Pattern Recognit. | 2025-05-01 |
| 571 | Retrieval Augmented Generation-driven Information Retrieval and Question Answering in Construction Management IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
CHENGKE WU et. al. | Adv. Eng. Informatics | 2025-05-01 |
| 572 | Augmenting General-purpose Large-language Models with Domain-specific Multimodal Knowledge Graph for Question-answering in Construction Project Management Related Papers Related Patents Related Grants Related Venues Related Experts View |
SHENGHUA ZHOU et. al. | Adv. Eng. Informatics | 2025-05-01 |
| 573 | A Resilient Generative Model in Few-shot Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Anqi Zou; Yanping Chen; Ruizhang Huang; Yongbin Qin; | Knowl. Based Syst. | 2025-05-01 |
| 574 | Simulating Question-answering Correctness with A Conditional Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a method called Diffusion-based Simulator (DSim), which takes advantage of diffusion to alleviate the bias accumulation. |
Ting Long; Li’ang Yin; Yi Chang; Wei Xia; Yong Yu; | www | 2025-04-30 |
| 575 | ConSens: Assessing Context Grounding in Open-book Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing evaluation methods, primarily based on the LLM-as-a-judge approach, face significant limitations, including biases, scalability issues, and dependence on costly external systems. To address these challenges, we propose a novel metric that contrasts the perplexity of the model response under two conditions: when the context is provided and when it is not. |
Ivan Vankov; Matyo Ivanov; Adriana Correia; Victor Botev; | arxiv-cs.CL | 2025-04-30 |
| 576 | Talk Before You Retrieve: Agent-Led Discussions for Better RAG in Medical QA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing medical RAG systems suffer from two key limitations: (1) a lack of modeling for human-like reasoning behaviors during information retrieval, and (2) reliance on suboptimal medical corpora, which often results in the retrieval of irrelevant or noisy snippets. To overcome these challenges, we propose Discuss-RAG, a plug-and-play module designed to enhance the medical QA RAG system through collaborative agent-based reasoning. |
XUANZHAO DONG et. al. | arxiv-cs.CL | 2025-04-29 |
| 577 | Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For domain-specific tasks, it is not clear whether the teacher model, the student model, or both must be considered for domain adaptation. In this work, we study this problem from the perspective of the telecom-domain Question-Answering (QA) task. |
Rishika Sen; Sujoy Roychowdhury; Sumit Soman; H. G. Ranjani; Srikhetra Mohanty; | arxiv-cs.CL | 2025-04-28 |
| 578 | RepoChat: An LLM-Powered Chatbot for GitHub Repository Question-Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Software repositories contain a wealth of data about the software development process, such as source code, documentation, issue tracking, and commit histories. However, accessing … |
Samuel Abedu; Laurine Menneron; S. Khatoonabadi; Emad Shihab; | 2025 IEEE/ACM 22nd International Conference on Mining … | 2025-04-28 |
| 579 | VideoMultiAgents: A Multi-Agent Framework for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, many existing methods rely on feeding frame-level captions into a single model, making it difficult to adequately capture temporal and interactive contexts. To address this limitation, we introduce VideoMultiAgents, a framework that integrates specialized agents for vision, scene graph analysis, and text processing. |
NORIYUKI KUGO et. al. | arxiv-cs.CV | 2025-04-25 |
| 580 | FinBERT-QA: Financial Question Answering with Pre-trained BERT Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Motivated by the emerging demand in the financial industry for the automatic analysis of unstructured and structured data at scale, Question Answering (QA) systems can provide … |
Bithiah Yuan; | ArXiv | 2025-04-24 |
| 581 | A Comprehensive Survey of Knowledge-Based Vision Question Answering Systems: The Lifecycle of Knowledge in Visual Reasoning Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite substantial progress, no comprehensive survey currently exists that systematically organizes and reviews the existing KB-VQA methods. This survey aims to fill this gap by establishing a structured taxonomy of KB-VQA approaches, and categorizing the systems into main stages: knowledge representation, knowledge retrieval, and knowledge reasoning. |
Jiaqi Deng; Zonghan Wu; Huan Huo; Guandong Xu; | arxiv-cs.CV | 2025-04-24 |
| 582 | Credible Plan-Driven RAG Method for Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing RAG methods often suffer from deviations in reasoning paths and cumulative errors in intermediate steps, reducing the fidelity of the final answer. To address these limitations, we propose PAR-RAG (Plan-then-Act-and-Review RAG), a novel framework inspired by the PDCA (Plan-Do-Check-Act) cycle, to enhance both the accuracy and factual consistency in multi-hop question answering. Specifically, PAR-RAG selects exemplars matched by the semantic complexity of the current question to guide complexity-aware top-down planning, resulting in more precise and coherent multi-step reasoning trajectories. |
NINGNING ZHANG et. al. | arxiv-cs.CL | 2025-04-23 |
| 583 | TraveLLaMA: Facilitating Multi-modal Large Language Models to Understand Urban Scenes and Provide Travel Assistance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present TraveLLaMA, a specialized multimodal language model designed for urban scene understanding and travel assistance. |
MENG CHU et. al. | arxiv-cs.CV | 2025-04-23 |
| 584 | FinDER: Financial Dataset for Question Answering and Evaluating Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By challenging models to retrieve relevant information from large corpora rather than relying on readily determined contexts, FinDER offers a more realistic benchmark for evaluating RAG systems. We further present a comprehensive evaluation of multiple state-of-the-art retrieval models and Large Language Models, showcasing challenges derived from a realistic benchmark to drive future research on truthful and precise RAG in the financial domain. |
CHANYEOL CHOI et. al. | arxiv-cs.IR | 2025-04-22 |
| 585 | LLM-KGMQA: Large Language Model-augmented Multi-hop Question-answering System Based on Knowledge Graph in Medical Field Related Papers Related Patents Related Grants Related Venues Related Experts View |
FEILONG WANG et. al. | Knowl. Inf. Syst. | 2025-04-21 |
| 586 | Efficient Document Retrieval with G-Retriever Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an enhanced approach that replaces the PCST method with an attention-based sub-graph construction technique, enabling more efficient and context-aware retrieval. |
Manthankumar Solanki; | arxiv-cs.LG | 2025-04-21 |
| 587 | A Hierarchical Framework for Measuring Scientific Paper Innovation Via Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose HSPIM, a hierarchical and training-free framework based on large language models (LLMs). |
Hongming Tan; Shaoxiong Zhan; Fengwei Jia; Hai-Tao Zheng; Wai Kin Chan; | arxiv-cs.CL | 2025-04-20 |
| 588 | Long-context Non-factoid Question Answering in Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores context-shortening techniques, including Open Information Extraction (OIE), coreference resolution, Answer Paragraph Selection (APS), and their combinations, to improve QA performance. |
Ritwik Mishra; Rajiv Ratn Shah; Ponnurangam Kumaraguru; | arxiv-cs.CL | 2025-04-18 |
| 589 | LLM-as-a-Judge: Reassessing The Performance of LLMs in Extractive QA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we reassess the performance of QA models using LLM-as-a-judge across four reading comprehension QA datasets. |
Xanh Ho; Jiahao Huang; Florian Boudin; Akiko Aizawa; | arxiv-cs.CL | 2025-04-16 |
| 590 | Mitigating LLM Hallucinations with Knowledge Graphs: A Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research paper provides learning outcomes from a case study with LinkQ, an open-source natural language interface that was developed to combat hallucinations by forcing an LLM to query a knowledge graph (KG) for ground-truth data during question-answering (QA). |
Harry Li; Gabriel Appleby; Kenneth Alperin; Steven R Gomez; Ashley Suh; | arxiv-cs.HC | 2025-04-16 |
| 591 | Pre-training, Fine-tuning and Re-ranking: A Three-Stage Framework for Legal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a three-stage (pre-training, fine-tuning and re-ranking) framework for legal QA (called PFR-LQA), which promotes the fine-grained text representation learning and boosts the performance of dense retrieval with the dual-encoder architecture. |
S. Ni; H. Cheng; M. Yang; | icassp | 2025-04-15 |
| 592 | MQAD: A Large-Scale Question Answering Dataset for Training Music Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MQAD, a music QA dataset built on the Million Song Dataset (MSD), encompassing a rich array of musical features, including beat, chord, key, structure, instrument, and genre, across 270,000 tracks, featuring nearly 3 million diverse questions and captions. |
Z. OUYANG et. al. | icassp | 2025-04-15 |
| 593 | Multi-Prototype Grouping for Continual Learning in Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose ProtoGroup, a multi-prototype grouping continual learning framework for VQA, which groups prototypes based on their similarity to obtain more accurate and stable sample-invariant features. |
L. Zhang; Z. Mao; Y. Peng; Z. Fu; Y. Zhang; | icassp | 2025-04-15 |
| 594 | Electrocardiogram Report Generation and Question Answering Via Retrieval-Augmented Self-Supervised Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Interpreting electrocardiograms (ECGs) and generating comprehensive reports remain challenging tasks in cardiology, often requiring specialized expertise and significant time investment. To address these critical issues, we propose ECG-ReGen, a retrieval-based approach for ECG-to-text report generation and question answering. |
J. Tang; T. Xia; Y. Lu; C. Mascolo; A. Saeed; | icassp | 2025-04-15 |
| 595 | Visual Entity-Centric Prompting for Knowledge Retrieval in Knowledge-based VQA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a visual entity-centric prompting for knowledge retrieval (VEPR) to bridge the gap between the implicit and explicit knowledge driven by visual entities via large language models. |
J. YANG et. al. | icassp | 2025-04-15 |
| 596 | A Hierarchical Reasoning Framework for Complex Question Answering Over Knowledge Graph with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance model interpretability, reinforcement learning based methods are introduced. |
Z. Zhang; Z. Zhang; Y. Zhang; W. Zhao; | icassp | 2025-04-15 |
| 597 | Constraint-Awareness and Graph Reasoning for Temporal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thirdly, a gap exists between the vector spaces of KG embeddings and question representations, and a simplistic hard concatenation or fusion of these two can lead to suboptimal solutions. To address these shortcomings, we propose a temporally aware QA method named Constraint-Awareness and Graph Reasoning. |
Z. Sun; K. Zhang; X. Zhang; J. Liu; | icassp | 2025-04-15 |
| 598 | SiQA: A Large Multi-Modal Question Answering Model for Structured Images Based on RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SiQA, a knowledge construction and Retrieval-Augmented Generation (RAG)-based multimodal Question-Answering model designed for Structured Images. |
J. Liu; Y. Tao; F. Wang; H. Li; X. Qin; | icassp | 2025-04-15 |
| 599 | Exploring The Role of Knowledge Graph-Based RAG in Japanese Medical Question Answering with Small-Scale LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) perform well in medical QA, but their effectiveness in Japanese contexts is limited due to privacy constraints that prevent the use of commercial models like GPT-4 in clinical settings. |
YINGJIAN CHEN et. al. | arxiv-cs.CL | 2025-04-15 |
| 600 | Ai2 Scholar QA: Organized Literature Synthesis with Attribution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Ai2 Scholar QA, a free online scientific question answering application. |
AMANPREET SINGH et. al. | arxiv-cs.CL | 2025-04-15 |
| 601 | Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a text-to-speech (TTS) system powered by a fine-tuned Llama model, named TTS-Llama, that achieves state-of-the-art speech synthesis performance. |
M. Shen; | icassp | 2025-04-15 |
| 602 | Leveraging Chain of Thought Towards Empathetic Spoken Dialogue Without Corresponding Question-Answering Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we propose a novel approach that circumvents the need for question-answering data, called Listen, Perceive, and Express (LPE). |
J. Xie; | icassp | 2025-04-15 |
| 603 | Bridging Neural and Symbolic Reasoning: A Dual-System Framework for Interpretable Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, LLMs often lack transparency in their reasoning processes and struggle with hallucination. To overcome these challenges, we propose Dual-NeSy, a Dual-system framework that integrates Neural networks with Symbolic logic for interpretable question answering. |
J. Shi; X. Ding; H. Zhao; T. Liu; B. Qin; | icassp | 2025-04-15 |
| 604 | Speech Retrieval-Augmented Generation Without Automatic Speech Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While this cascaded pipeline has proven effective in many practical settings, ASR errors can propagate to the retrieval and generation steps. To overcome this limitation, we introduce SpeechRAG, a novel framework designed for open-question answering over spoken data. |
D. J. MIN et. al. | icassp | 2025-04-15 |
| 605 | Audiopedia: Audio QA with Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Audiopedia, a novel task called Audio Question Answering with Knowledge, which requires both audio comprehension and external knowledge reasoning. |
A. S. Penamakuri; K. Chhatre; A. Jain; | icassp | 2025-04-15 |
| 606 | A Continual Learning Approach for Embodied Question Answering with Generative Adversarial Imitation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we proposed a continual learning method based on generative adversarial imitation learning and self-supervision to support the agent when facing unseen environments. |
X. ZENG et. al. | icassp | 2025-04-15 |
| 607 | AskQE: Question Answering As Automatic Evaluation for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: How can a monolingual English speaker determine whether an automatic translation in French is good enough to be shared? Existing MT error detection and quality estimation (QE) techniques do not address this practical scenario. We introduce AskQE, a question generation and answering framework designed to detect critical MT errors and provide actionable feedback, helping users decide whether to accept or reject MT outputs even without the knowledge of the target language. |
Dayeon Ki; Kevin Duh; Marine Carpuat; | arxiv-cs.CL | 2025-04-15 |
| 608 | Seek and Solve Reasoning for Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper reveals that the reasoning process during task simplification may be more valuable than the simplified tasks themselves and aims to improve TQA performance by leveraging LLMs’ reasoning capabilities. We propose a Seek-and-Solve pipeline that instructs the LLM to first seek relevant information and then answer questions, integrating these two stages at the reasoning level into a coherent Seek-and-Solve Chain of Thought (SS-CoT). |
R. Jiang; C. Wang; W. Deng; | icassp | 2025-04-15 |
| 609 | Composable NLP Workflows for BERT-based Ranking and QA System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we built an end-to-end Ranking and Question-Answering (QA) system using Forte, a toolkit that makes composable NLP pipelines. |
Gaurav Kumar; Murali Mohana Krishna Dandu; | arxiv-cs.CL | 2025-04-12 |
| 610 | Multimodal Commonsense Knowledge Distillation for Visual Question Answering (Student Abstract) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Existing Multimodal Large Language Models (MLLMs) and Visual Language Pretrained Models (VLPMs) have shown remarkable performances in general Visual Question Answering (VQA). … |
Shuo Yang; Siwen Luo; S. Han; | AAAI Conference on Artificial Intelligence | 2025-04-11 |
| 611 | Knowledge Graph-extended Retrieval Augmented Generation for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ideally, an AI system should be both robust to missing facts as well as easy to communicate with. This paper proposes such a system that integrates LLMs and KGs without requiring training, ensuring adaptability across different KGs with minimal human effort. |
Jasper Linders; Jakub M. Tomczak; | arxiv-cs.LG | 2025-04-11 |
| 612 | VLMT: Vision-Language Multimodal Transformer for Multimodal Multi-hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods for Multimodal Multi-hop Question Answering (MMQA) often suffer from limited reasoning capabilities, reliance on modality conversion, and inadequate alignment between visual and textual representations. To address these limitations, this paper introduces Vision-Language Multimodal Transformer (VLMT), a unified architecture that integrates a transformer-based vision encoder with a sequence-to-sequence language model. |
Qi Zhi Lim; Chin Poo Lee; Kian Ming Lim; Kalaiarasi Sonai Muthu Anbananthen; | arxiv-cs.CV | 2025-04-11 |
| 613 | TALE: A Tool-Augmented Framework for Reference-Free Evaluation of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Tool-Augmented LLM Evaluation (TALE), a framework to assess LLM outputs without predetermined ground-truth answers. |
Sher Badshah; Ali Emami; Hassan Sajjad; | arxiv-cs.CL | 2025-04-09 |
| 614 | Visual Question Answering: A Survey of Methods, Datasets, Evaluation, and Challenges Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Visual question answering (VQA) is a dynamic field of research that aims to generate textual answers from given visual and question information. It is a multimodal field that has … |
Byeong Su Kim; Jieun Kim; Deokwoo Lee; Beakcheol Jang; | ACM Computing Surveys | 2025-04-08 |
| 615 | REVEAL: Relation-based Video Representation Learning for Video-Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Video-Question-Answering (VideoQA) requires capturing complex visual relation changes over time, remaining a challenge even for advanced Video Language Models (VLMs), in part because of the need to represent the visual content as a reasonably sized input for those models. To address this problem, we propose RElation-based Video rEpresentAtion Learning (REVEAL), a framework designed to capture visual relation information by encoding it into structured, decomposed representations. |
Sofian Chaybouti; Walid Bousselham; Moritz Wolter; Hilde Kuehne; | arxiv-cs.CV | 2025-04-07 |
| 616 | Simplifying Data Integration: SLM-Driven Systems for Unified Semantic Queries Across Heterogeneous Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel Small Language Model(SLM)-driven system that synergizes advancements in lightweight Retrieval-Augmented Generation (RAG) and semantic-aware data structuring to enable efficient, accurate, and scalable query resolution across diverse data formats. |
Teng Lin; | arxiv-cs.DB | 2025-04-07 |
| 617 | Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering Via White-Box and Black-Box LLM Collaboration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Collab-RAG, a collaborative training framework that leverages mutual enhancement between a white-box small language model (SLM) and a black-box large language model (LLM) for RAG. |
RAN XU et. al. | arxiv-cs.CL | 2025-04-07 |
| 618 | Advancing Egocentric Video Question Answering with Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce QaEgo4Dv2 to mitigate annotation noise in QaEgo4D, enabling more reliable comparison. |
Alkesh Patel; Vibhav Chitalia; Yinfei Yang; | arxiv-cs.CV | 2025-04-06 |
| 619 | Enabling Collaborative Parametric Knowledge Calibration for Retrieval-Augmented Vision Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fully exploit the cross-task synergy in KB-VQA, we propose a unified retrieval-augmented VQA framework with collaborative parametric knowledge calibration. |
JIAQI DENG et. al. | arxiv-cs.CV | 2025-04-05 |
| 620 | QIRL: Boosting Visual Question Answering Via Optimized Question-Image Relation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Second, they do not assess the relevance between the input question and image during inference, as no prior work has examined the degree of input relevance in debiasing studies. Motivated by these limitations, we propose a novel framework, Optimized Question-Image Relation Learning (QIRL), which employs a generation-based self-supervised learning strategy. |
QUANXING XU et. al. | arxiv-cs.CV | 2025-04-04 |
| 621 | Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: (2) Excessive reliance on implicit learning in Transformer-based cross-modal self-attention fusion methods, which obscures crucial local semantic correlations in medical scenarios. To address these issues, this study proposes a HiCA-VQA method, including two modules: Hierarchical Prompting for fine-grained medical questions and Hierarchical Answer Decoders. |
Junkai Zhang; Bin Li; Shoujun Zhou; Yue Du; | arxiv-cs.CV | 2025-04-03 |
| 622 | Single-Pass Document Scanning for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a single-pass document scanning approach that processes the entire text in linear time, preserving global coherence while deciding which sentences are most relevant to the query. |
WEILI CAO et. al. | arxiv-cs.CL | 2025-04-03 |
| 623 | Leveraging Static Relationships for Intra-Type and Inter-Type Message Passing in Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although methods based on static relationship reasoning have made certain progress, there are still deficiencies in the accuracy of static relationship recognition and representation, and they have not fully utilized the static relationship information in videos for in-depth reasoning and analysis. Therefore, this paper proposes a reasoning method for intra-type and inter-type message passing based on static relationships. |
Lili Liang; Guanglu Sun; | arxiv-cs.CV | 2025-04-03 |
| 624 | GeoRAG: A Question-Answering Approach from A Geographical Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents GeoRAG, a knowledge-enhanced QA framework integrating domain-specific fine-tuning and prompt engineering with Retrieval-Augmented Generation (RAG) technology to enhance geographic knowledge retrieval accuracy and user interaction. |
JIAN WANG et. al. | arxiv-cs.IR | 2025-04-02 |
| 625 | RBTM: A Hybrid Gradient Regression-Based Transformer Model for Biomedical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Suneetha Vazrala; Thayyaba Khatoon Mohammed; | Biomed. Signal Process. Control. | 2025-04-01 |
| 626 | Chart Question Answering with Multimodal Graph Representation Learning and Zero-shot Classification Related Papers Related Patents Related Grants Related Venues Related Experts View |
A. Farahani; Peyman Adibi; Sayyed Mohammad Saeed Ehsani; Hans-Peter Hutter; Alireza Darvishy; | Expert Syst. Appl. | 2025-04-01 |
| 627 | Bayesian-error-informed Contrastive Learning for Knowledge-based Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View |
Sudarshan Yerragunta; R. Prasath; G. Girish; | Comput. Electr. Eng. | 2025-04-01 |
| 628 | Chatbot Dialog Design for Improved Human Performance in Domain Knowledge Discovery Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The advent of machine learning (ML) has led to the widespread adoption of developing task-oriented dialog systems for scientific applications (e.g., science gateways) where … |
ROLAND ORUCHE et. al. | IEEE Transactions on Human-Machine Systems | 2025-04-01 |
| 629 | Complex Knowledge Base Question Answering with Difficulty-aware Active Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View |
DONG WANG et. al. | Expert Syst. Appl. | 2025-04-01 |
| 630 | Biomedical Question Answering Via Multi-Level Summarization on A Local Knowledge Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel method that utilizes propositional claims to construct a local knowledge graph from retrieved documents. |
Lingxiao Guan; Yuanhao Huang; Jie Liu; | arxiv-cs.CL | 2025-04-01 |
| 631 | Are You Really Listening? Boosting Perceptual Awareness in Music-QA Benchmarks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These findings suggest existing benchmarks predominantly assess reasoning abilities rather than audio perception. To overcome this challenge, we present RUListening: Robust Understanding through Listening, a framework that enhances perceptual evaluation in Music-QA benchmarks. |
Yongyi Zang; Sean O’Brien; Taylor Berg-Kirkpatrick; Julian McAuley; Zachary Novack; | arxiv-cs.SD | 2025-03-31 |
| 632 | Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, in MCQA tasks, the absence of relevant KG knowledge for certain answer options remains a significant challenge. To address these issues, we propose Question-Aware Knowledge Graph Prompting (QAP), which incorporates question embeddings into GNN aggregation to dynamically assess KG relevance. |
Haochen Liu; Song Wang; Chen Chen; Jundong Li; | arxiv-cs.CL | 2025-03-30 |
| 633 | Memory-Aware and Uncertainty-Guided Retrieval for Multi-Hop Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While Retrieval-Augmented Generation (RAG) has made progress in this area, existing methods often suffer from two key limitations: (1) fixed or overly frequent retrieval steps, and (2) ineffective use of previously retrieved knowledge. We propose MIND (Memory-Informed and INteractive Dynamic RAG), a framework that addresses these challenges through: (i) prompt-based entity extraction to identify reasoning-relevant elements, (ii) dynamic retrieval triggering based on token-level entropy and attention signals, and (iii) memory-aware filtering, which stores high-confidence facts across reasoning steps to enable consistent multi-hop generation. |
Yuelyu Ji; Rui Meng; Zhuochun Li; Daqing He; | arxiv-cs.CL | 2025-03-29 |
| 634 | FReM: A Flexible Reasoning Mechanism for Balancing Quick and Slow Thinking in Long-Context Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Quick thinking usually relies on pattern matching rather than truly understanding the query logic, which misses proper understanding. To address these issues, we propose FReM: Flexible Reasoning Mechanism, a method that adjusts reasoning depth according to the complexity of each question. |
ZHENGYI ZHAO et. al. | arxiv-cs.CL | 2025-03-29 |
| 635 | Can DeepSeek Reason Like A Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the dialogue capabilities of the DeepSeek model in robotic surgery scenarios, focusing on tasks such as Single Phrase QA, Visual QA, and Detailed Description. |
BOYI MA et. al. | arxiv-cs.CV | 2025-03-29 |
| 636 | Preference-based Learning with Retrieval Augmented Generation for Conversational Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents PRAISE, a pipeline-based approach for ConvQA that trains LLM adapters for each of the three subtasks. |
Magdalena Kaiser; Gerhard Weikum; | arxiv-cs.CL | 2025-03-28 |
| 637 | AskSport: Web Application for Sports Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces AskSport, a question-answering web application about sports. |
Enzo B Onofre; Leonardo M P Moraes; Cristina D Aguiar; | arxiv-cs.AI | 2025-03-26 |
| 638 | DomainCQA: Crafting Knowledge-Intensive QA from Domain-Specific Charts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Chart Question Answering (CQA) evaluates Multimodal Large Language Models (MLLMs) on visual understanding and reasoning over chart data. |
YUJING LU et. al. | arxiv-cs.CL | 2025-03-25 |
| 639 | VGAT: A Cancer Survival Analysis Framework Transitioning from Generative Visual Question Answering to Genomic Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To enable survival prediction using only whole-slide images (WSI), we propose the Visual-Genomic Answering-Guided Transformer (VGAT), a framework integrating Visual Question Answering (VQA) techniques for genomic modality reconstruction. By adapting VQA’s text feature extraction approach, we derive stable genomic representations that circumvent dimensionality challenges in raw genomic data. Simultaneously, a cluster-based visual prompt module selectively enhances discriminative WSI patches, addressing noise from unfiltered image regions. Evaluated across five TCGA datasets, VGAT outperforms existing WSI-only methods, demonstrating the viability of genomic-informed inference without sequencing. |
ZIZHI CHEN et. al. | arxiv-cs.CV | 2025-03-25 |
| 640 | DomainCQA: Crafting Expert-Level QA from Domain-Specific Charts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Chart Question Answering (CQA) benchmarks are essential for evaluating the capability of Multimodal Large Language Models (MLLMs) to interpret visual data. However, current … |
LING ZHONG et. al. | ArXiv | 2025-03-25 |
| 641 | Improved Alignment of Modalities in Large Vision Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose four training stages for aligning the vision model with the language model, in other words, the language model is given an ability to process visual inputs. |
Kartik Jangra; Aman Kumar Singh; Yashwani Mann; Geetanjali Rathee; | arxiv-cs.CV | 2025-03-25 |
| 642 | A Survey of Large Language Model Agents for Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper surveys the development of large language model (LLM)-based agents for question answering (QA). |
Murong Yue; | arxiv-cs.CL | 2025-03-24 |
| 643 | MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Visual Question Answering (VQA) requires reasoning across visual and textual modalities, yet Large Vision-Language Models (LVLMs) often lack integrated commonsense knowledge, limiting their robustness in real-world scenarios. To address this, we introduce MAGIC-VQA, a novel framework that enhances VQA by systematically integrating commonsense knowledge with LVLMs. |
Shuo Yang; Siwen Luo; Soyeon Caren Han; Eduard Hovy; | arxiv-cs.CL | 2025-03-24 |
| 644 | SUNAR: Semantic Uncertainty Based Neighborhood Aware Retrieval for Complex QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce SUNAR, a novel approach that leverages LLMs to guide a Neighborhood Aware Retrieval process. |
V Venktesh; Mandeep Rathee; Avishek Anand; | arxiv-cs.IR | 2025-03-23 |
| 645 | Joint Extraction Matters: Prompt-Based Visual Question Answering for Multi-Field Document Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the merits of extracting multiple fields jointly versus separately. |
Mengsay Loem; Taiju Hosaka; | arxiv-cs.CL | 2025-03-21 |
| 646 | MTBench: A Multimodal Time Series Benchmark for Temporal Reasoning and Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While multimodal learning has gained traction, existing multimodal time-series datasets fall short in evaluating cross-modal reasoning and complex question answering, which are essential for capturing complex interactions between narrative information and temporal patterns. To bridge this gap, we introduce Multimodal Time Series Benchmark (MTBench), a large-scale benchmark designed to evaluate large language models (LLMs) on time series and text understanding across financial and weather domains. |
JIALIN CHEN et. al. | arxiv-cs.CL | 2025-03-21 |
| 647 | Agentic Keyframe Search for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address it, we propose Agentic Keyframe Search (AKeyS), a simple yet powerful algorithm for identifying keyframes in the VideoQA task. |
Sunqi Fan; Meng-Hao Guo; Shuojin Yang; | arxiv-cs.CV | 2025-03-20 |
| 648 | MKG-Rank: Enhancing Large Language Models with Knowledge Graph for Multilingual Medical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have shown remarkable progress in medical question answering (QA), yet their effectiveness remains predominantly limited to English due to imbalanced multilingual training data and scarce medical resources for low-resource languages. To address this critical language gap in medical QA, we propose Multilingual Knowledge Graph-based Retrieval Ranking (MKG-Rank), a knowledge graph-enhanced framework that enables English-centric LLMs to perform multilingual medical QA. |
FEIYANG LI et. al. | arxiv-cs.CL | 2025-03-20 |
| 649 | Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce PKR-QA (Procedural Knowledge Reasoning Question Answering), a new benchmark for question answering over procedural tasks that require structured reasoning. |
THANH-SON NGUYEN et. al. | arxiv-cs.CV | 2025-03-19 |
| 650 | Bias Evaluation and Mitigation in Retrieval-Augmented Medical Question-Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study systematically evaluates demographic biases within medical RAG pipelines across multiple QA benchmarks, including MedQA, MedMCQA, MMLU, and EquityMedQA. |
Yuelyu Ji; Hang Zhang; Yanshan Wang; | arxiv-cs.CL | 2025-03-19 |
| 651 | Right Answer, Wrong Score: Uncovering The Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we shed light on the inconsistencies of MCQA evaluation strategies, which can lead to inaccurate and misleading model comparisons. |
FRANCESCO MARIA MOLFESE et. al. | arxiv-cs.CL | 2025-03-19 |
| 652 | EIAD: Explainable Industrial Anomaly Detection Via Multi-Modal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, the application of large multi-modal models in IAD remains in its infancy, facing challenges in balancing question-answering (QA) performance and mask-based grounding capabilities, often owing to overfitting during the fine-tuning process. To address these challenges, we propose a novel approach that introduces a dedicated multi-modal defect localization module to decouple the dialog functionality from the core feature extraction. |
Zongyun Zhang; Jiacheng Ruan; Xian Gao; Ting Liu; Yuzhuo Fu; | arxiv-cs.AI | 2025-03-18 |
| 653 | Elevating Visual Question Answering Through Implicitly Learned Reasoning Pathways in LVLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Vision-Language Models (LVLMs) have shown remarkable progress in various multimodal tasks, yet they often struggle with complex visual reasoning that requires multi-step inference. To address this limitation, we propose MF-SQ-LLaVA, a novel approach that enhances LVLMs by enabling implicit self-questioning through end-to-end training. |
Liu Jing; Amirul Rahman; | arxiv-cs.CV | 2025-03-18 |
| 654 | Synthetic Clarification and Correction Dialogues About Data-Centric Tasks — A Teacher-Student Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a novel framework for synthetically generating controlled, multi-turn conversations between a user and AI assistant for the task of table-based question answering, which can be generated from an existing dataset with fully specified table QA examples for any target domain. |
Christian Poelitz; Nick McKenna; | arxiv-cs.CL | 2025-03-18 |
| 655 | Synthetic Clarification and Correction Dialogues About Data-Centric Tasks – A Teacher-Student Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Real dialogues with AI assistants for solving data-centric tasks often follow dynamic, unpredictable paths due to imperfect information provided by the user or in the data, which … |
Christian Poelitz; Nick McKenna; | ArXiv | 2025-03-18 |
| 656 | Generalization V.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To effectively capture task-specific pretraining data frequency, we propose a novel task-gram language model, which is built by counting the co-occurrence of semantically related $n$-gram pairs from task inputs and outputs in the pretraining corpus. |
XINYI WANG et. al. | iclr | 2025-03-17 |
| 657 | VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce VideoMind, a novel video-language agent designed for temporal-grounded video understanding. |
Ye Liu; Kevin Qinghong Lin; Chang Wen Chen; Mike Zheng Shou; | arxiv-cs.CV | 2025-03-17 |
| 658 | ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a result, we present *ClimaQA-Gold*, an expert-annotated benchmark dataset alongside *ClimaQA-Silver*, a large-scale, comprehensive synthetic QA dataset for climate science. |
VEERAMAKALI VIGNESH MANIVANNAN et. al. | iclr | 2025-03-17 |
| 659 | CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Due to the limitations of MCQ evaluations and the advanced reasoning abilities of MLLMs, models can often answer correctly by combining short video insights with elimination, without truly understanding the content. To bridge this gap, we introduce CG-Bench, a benchmark for clue-grounded question answering in long videos. |
GUO CHEN et. al. | iclr | 2025-03-17 |
| 660 | SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current benchmarks for video understanding typically emphasize isolated single-instance text inputs and fail to evaluate the capacity to sustain temporal reasoning throughout the entire duration of video streams. To address these limitations, we introduce SVBench, a pioneering benchmark with temporal multi-turn question-answering chains specifically designed to thoroughly assess the capabilities of streaming video understanding of current LVLMs. |
ZHENYU YANG et. al. | iclr | 2025-03-17 |
| 661 | Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These challenges often arise due to the complexity and ambiguity present in longer texts. To enhance the performance of LLMs in such scenarios, we introduce the Long Question Coreference Adaptation (LQCA) method. |
YANMING LIU et. al. | iclr | 2025-03-17 |
| 662 | MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study whether MLLMs can perceive small visual details as effectively as large ones when answering questions about images. |
Jiarui Zhang; Mahyar Khayatkhoei; Prateek Chhikara; Filip Ilievski; | iclr | 2025-03-17 |
| 663 | Streaming Video Question-Answering with In-context Video KV-Cache Retrieval IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose ReKV, a novel training-free approach that enables efficient streaming video question-answering (StreamingVQA), by seamlessly integrating with existing Video Large Language Models (Video-LLMs). |
SHANGZHE DI et. al. | iclr | 2025-03-17 |
| 664 | CofCA: A STEP-WISE Counterfactual Multi-hop QA Benchmark IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, current factual Multi-hop QA (MHQA) benchmarks are annotated on open-source corpora such as Wikipedia, although useful for multi-step reasoning evaluation, they show limitations due to the potential data contamination in LLMs’ pre-training stage. To address these issues, we introduce the Step-wise and Counterfactual benchmark (CofCA), a novel evaluation benchmark consisting of factual data and counterfactual data that reveals LLMs’ real reasoning abilities on multi-step reasoning and reasoning chain evaluation. |
Jian Wu; Linyi Yang; Zhen Wang; Manabu Okumura; Yue Zhang; | iclr | 2025-03-17 |
| 665 | Chain-of-Action: Faithful and Multimodal Question Answering Through Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a Chain-of-Action (CoA) framework for multimodal and retrieval-augmented Question-Answering (QA). |
Zhenyu Pan; Haozheng Luo; Manling Li; Han Liu; | iclr | 2025-03-17 |
| 666 | Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present Shot2Story, a new multi-shot video understanding benchmark with detailed shot-level captions, comprehensive video summaries and question-answering pairs. |
Mingfei Han; Linjie Yang; Xiaojun Chang; Lina Yao; Heng Wang; | iclr | 2025-03-17 |
| 667 | Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Towards a solution, we introduce MIRAGE (Multi-Image Retrieval Augmented Generation), an open-source, lightweight visual-RAG framework that processes up to 10k images on a single 40G A100 GPU—far surpassing the 1k-image limit of contemporary models. |
TSUNG-HAN WU et. al. | iclr | 2025-03-17 |
| 668 | QA-Calibration of Language Model Confidence Scores Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We argue, however, that this standard (average-case) notion of calibration is difficult to interpret for decision-making in generative QA. To address this, we generalize the standard notion of average calibration and introduce QA-calibration, which ensures calibration holds across different question-and-answer groups. |
Putra Manggala; Atalanti A. Mastakouri; Elke Kirschbaum; Shiva Kasiviswanathan; Aaditya Ramdas; | iclr | 2025-03-17 |
| 669 | GFSNet: Gaussian Fourier with Sparse Attention Network for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Xiang Shen; Dezhi Han; C. Chang; Ammar Oad; Huafeng Wu; | Artif. Intell. Rev. | 2025-03-15 |
| 670 | Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We apply the group relative policy optimization (GRPO) algorithm to Qwen2-Audio-7B-Instruct, and our experiments demonstrated state-of-the-art performance on the MMAU Test-mini benchmark, achieving an accuracy rate of 64.5%. The main findings in this technical report are as follows: 1) The GRPO algorithm can be effectively applied to large audio language models (LALMs), even when the model has only 8.2B parameters; 2) With only 38k post-training samples, RL significantly outperforms supervised fine-tuning (SFT), indicating that RL-based approaches can be effective without large datasets; 3) The explicit reasoning process has not shown significant benefits for AQA tasks, and how to efficiently utilize deep thinking remains an open question for further research; 4) LALMs still lag far behind humans in auditory-language reasoning, suggesting that RL-based approaches warrant further exploration. |
GANG LI et. al. | arxiv-cs.SD | 2025-03-14 |
| 671 | RePanda: Pandas-powered Tabular Verification and Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce RePanda, a structured fact verification approach that translates claims into executable pandas queries, enabling interpretable and verifiable reasoning. |
Atoosa Malemir Chegini; Keivan Rezaei; Hamid Eghbalzadeh; Soheil Feizi; | arxiv-cs.LG | 2025-03-14 |
| 672 | Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This typically occurs because these models tend to prioritize self-generated content over the input context, causing them to disregard pertinent details. To address this challenge, we introduce a novel method called Guided Attention Map Editing (GAME), which dynamically adjusts attention maps to improve contextual relevance. |
YU WANG et. al. | arxiv-cs.CL | 2025-03-11 |
| 673 | MapQA: Open-domain Geospatial Question Answering on Map Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A major challenge in scaling geospatial QA datasets for reasoning lies in the complexity of geospatial relationships, which require integrating spatial structures, topological dependencies, and multi-hop reasoning capabilities that most text-based QA datasets lack. To address these limitations, we introduce MapQA, a novel dataset that not only provides question-answer pairs but also includes the geometries of geo-entities referenced in the questions. |
Zekun Li; Malcolm Grossman; Mihir Kulkarni; Muhao Chen; Yao-Yi Chiang; | arxiv-cs.CL | 2025-03-10 |
| 674 | Talking to GDELT Through Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we study various Retrieval-Augmented Generation (RAG) approaches to gain an understanding of the strengths and weaknesses of each approach in a question-answering analysis. |
AUDUN MYERS et. al. | arxiv-cs.IR | 2025-03-10 |
| 675 | ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces ReAgent: a Reversible multi-Agent collaborative framework augmented with explicit backtracking mechanisms, enabling reversible multi-hop reasoning. |
XINJIE ZHAO et. al. | arxiv-cs.AI | 2025-03-10 |
| 676 | VisualSimpleQA: A Benchmark for Decoupled Evaluation of Large Vision-Language Models in Fact-Seeking Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current multimodal fact-seeking benchmarks primarily focus on comparing model outputs to ground truth answers, providing limited insights into the performance of modality-specific modules. To bridge this gap, we introduce VisualSimpleQA, a multimodal fact-seeking benchmark with two key features. |
YANLING WANG et. al. | arxiv-cs.CL | 2025-03-09 |
| 677 | Towards Fine-Grained Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing datasets exhibit gaps in temporal and spatial granularity, which consequently limits the capabilities of existing VideoQA methods. This paper introduces the Multi-Object Multi-Actor Question Answering (MOMA-QA) dataset, which is designed to address these shortcomings by emphasizing temporal localization, spatial relationship reasoning, and entity-centric queries. |
WEI DAI et. al. | arxiv-cs.CV | 2025-03-09 |
| 678 | Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on The Enhanced Conformal Prediction Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes an enhanced CP framework for medical multiple-choice question-answering (MCQA) tasks. |
Yusong Ke; Hongru Lin; Yuting Ruan; Junya Tang; Li Li; | arxiv-cs.CL | 2025-03-07 |
| 679 | Evaluating Answer Reranking Strategies in Time-sensitive Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the impact of temporal characteristics of answers in Question Answering (QA) by exploring several simple answer selection techniques. |
Mehmet Kardan; Bhawna Piryani; Adam Jatowt; | arxiv-cs.CL | 2025-03-06 |
| 680 | Question-Aware Gaussian Experts for Audio-Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes QA-TIGER, a novel framework that explicitly incorporates question information and models continuous temporal dynamics. |
HONGYEOB KIM et. al. | arxiv-cs.CV | 2025-03-06 |
| 681 | BPQA Dataset: Evaluating How Well Language Models Leverage Blood Pressures to Answer Biomedical Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Blood pressure is an important component of biomedical data, which can be used to train transformer-based language models (LMs) for improving healthcare delivery. |
CHI HANG et. al. | arxiv-cs.CL | 2025-03-06 |
| 682 | Chart-HQA: A Benchmark for Hypothetical Question Answering in Charts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they overlook the inherent output biases of MLLMs, where models rely on their parametric memory to answer questions rather than genuinely understanding the chart content. To address this limitation, we introduce a novel Chart Hypothetical Question Answering (HQA) task, which imposes assumptions on the same question to compel models to engage in counterfactual reasoning based on the chart content. |
XIANGNAN CHEN et. al. | arxiv-cs.CL | 2025-03-06 |
| 683 | EgoLife: Towards Egocentric Life Assistant Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: By releasing our datasets, models, and benchmarks, we aim to stimulate further research in egocentric AI assistants. |
JINGKANG YANG et. al. | arxiv-cs.CV | 2025-03-05 |
| 684 | Optimizing Open-domain Question Answering with Graph-based Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we benchmark various graph-based retrieval-augmented generation (RAG) systems across a broad spectrum of query types, including OLTP-style (fact-based) and OLAP-style (thematic) queries, to address the complex demands of open-domain question answering (QA). |
JOYCE CAHOON et. al. | arxiv-cs.IR | 2025-03-04 |
| 685 | Towards Robust Expert Finding in Community Question Answering Platforms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces TUEF, a topic-oriented user-interaction model for fair Expert Finding in Community Question Answering (CQA) platforms. |
Maddalena Amendola; Andrea Passarella; Raffaele Perego; | arxiv-cs.IR | 2025-03-04 |
| 686 | OWLViz: An Open-World Benchmark for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a challenging benchmark for the Open WorLd VISual question answering (OWLViz) task. |
THUY NGUYEN et. al. | arxiv-cs.LG | 2025-03-04 |
| 687 | EchoQA: A Large Collection of Instruction Tuning Data for Echocardiogram Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel question-answering (QA) dataset using echocardiogram reports sourced from the Medical Information Mart for Intensive Care database. |
Lama Moukheiber; Mira Moukheiber; Dana Moukheiber; Jae-Woo Ju; Hyung-Chul Lee; | arxiv-cs.AI | 2025-03-04 |
| 688 | Zero-Shot Complex Question-Answering on Long Scientific Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a zero-shot pipeline framework that enables social science researchers to perform question-answering tasks that are complex yet of predetermined question formats on full-length research papers without requiring machine learning expertise. |
Wanting Wang; | arxiv-cs.IR | 2025-03-04 |
| 689 | Beyond Prompting: An Efficient Embedding Framework for Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose EmbQA, an embedding-level framework that alleviates these shortcomings by enhancing both the retriever and the reader. |
ZHANGHAO HU et. al. | arxiv-cs.CL | 2025-03-03 |
| 690 | SRAG: Structured Retrieval-Augmented Generation for Multi-Entity Question Answering Over Wikipedia Graph Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multi-entity question answering (MEQA) poses significant challenges for large language models (LLMs), which often struggle to consolidate scattered information across multiple … |
Teng Lin; Yizhang Zhu; Yuyu Luo; Nan Tang; | arxiv-cs.CL | 2025-03-03 |
| 691 | Q-NL Verifier: Leveraging Synthetic Data for Robust Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Question answering (QA) requires accurately aligning user questions with structured queries, a process often limited by the scarcity of high-quality query-natural language (Q-NL) pairs. To overcome this, we present Q-NL Verifier, an approach to generating high-quality synthetic pairs of queries and NL translations. |
Tim Schwabe; Louisa Siebel; Patrik Valach; Maribel Acosta; | arxiv-cs.CL | 2025-03-03 |
| 692 | LLMs Performance in Answering Educational Questions in Brazilian Portuguese: A Preliminary Analysis on LLMs Potential to Support Diverse Educational Needs Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Question-answering systems facilitate adaptive learning and respond to student queries, making education more responsive. Despite that, challenges such as natural language … |
LUIZ RODRIGUES et. al. | Proceedings of the 15th International Learning Analytics … | 2025-03-03 |
| 693 | Handling Language Prior and Compositional Reasoning Issues in Visual Question Answering System Related Papers Related Patents Related Grants Related Venues Related Experts View |
Souvik Chowdhury; Badal Soni; | Neurocomputing | 2025-03-01 |
| 694 | CooKie: Commonsense Knowledge-guided Mixture-of-experts Framework for Fine-grained Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Chao Wang; Jianming Yang; Yang Zhou; Xiaodong Yue; | Inf. Sci. | 2025-03-01 |
| 695 | Streaming Video Question-Answering with In-context Video KV-Cache Retrieval IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose ReKV, a novel training-free approach that enables efficient streaming video question-answering (StreamingVQA), by seamlessly integrating with existing Video Large Language Models (Video-LLMs). |
SHANGZHE DI et. al. | arxiv-cs.CV | 2025-03-01 |
| 696 | AILS-NTUA at SemEval-2025 Task 8: Language-to-Code Prompting and Error Fixing for Tabular Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present our submission to SemEval-2025 Task 8: Question Answering over Tabular Data. |
Andreas Evangelatos; Giorgos Filandrianos; Maria Lymperaiou; Athanasios Voulodimos; Giorgos Stamou; | arxiv-cs.CL | 2025-03-01 |
| 697 | Collaborative Aware Bidirectional Semantic Reasoning for Video Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Video question answering (VideoQA) is the challenging task of accurately responding to natural language questions based on a given video. Most previous methods focus on designing … |
Xize Wu; Jiasong Wu; Lei Zhu; L. Senhadji; Huazhong Shu; | IEEE Transactions on Circuits and Systems for Video … | 2025-03-01 |
| 698 | Bias-guided Margin Loss for Robust Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
YANHAN SUN et. al. | Inf. Process. Manag. | 2025-03-01 |
| 699 | Cycle-VQA: A Cycle-Consistent Framework for Robust Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
LIN FAN et. al. | Pattern Recognit. | 2025-03-01 |
| 700 | PASemiQA: Plan-Assisted Agent for Question Answering on Semi-Structured Data with Text and Relational Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing RAG methods typically focus on a single type of external data, such as vectorized text database or knowledge graphs, and cannot well handle real-world questions on semi-structured data containing both text and relational information. To bridge this gap, we introduce PASemiQA, a novel approach that jointly leverages text and relational information in semi-structured data to answer questions. |
Hansi Yang; Qi Zhang; Wei Jiang; Jianguo Li; | arxiv-cs.CL | 2025-02-28 |
| 701 | Glimpse of MCQ Based VQA in Road & Traffic Scenarios Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Multi-modal models have been a boon to the community over the last decade. The Large Language and Vision Models for Autonomous Driving (LLVM-AD) challenge [3] is a Visual … |
Athira Krishnan R; Sumukha Bg; Ambarish Parthasarathy; | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-28 |
| 702 | Evaluating Multimodal Vision-Language Model Prompting Strategies for Visual Question Answering in Road Scene Understanding IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Understanding complex traffic scenes is a crucial challenge in advancing autonomous driving systems. Visual Question Answering (VQA) tasks have emerged as a promising approach to … |
Aryan Keskar; Srinivasa Perisetla; Ross Greer; | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-28 |
| 703 | WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present WebFAQ, a large-scale collection of open-domain question answering datasets derived from FAQ-style schema.org annotations. |
Michael Dinzinger; Laura Caspari; Kanishka Ghosh Dastidar; Jelena Mitrović; Michael Granitzer; | arxiv-cs.CL | 2025-02-28 |
| 704 | Bisecting K-Means in RAG for Enhancing Question-Answering Tasks Performance in Telecommunications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents a novel Retrieval-Augmented Generation framework explicitly designed for the telecommunication domain, focusing on datasets composed of 3GPP documents. |
Pedro Sousa; Cláudio Klautau Mello; Frank B. Morte; Luis F. Solis Navarro; | arxiv-cs.IR | 2025-02-27 |
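A sketch of one way bisecting k-means can slot into a RAG retriever: cluster chunk embeddings offline, then restrict scoring to the query's nearest cluster. The random vectors stand in for real 3GPP chunk embeddings, and this is not the paper's pipeline; it only illustrates the clustering-then-retrieval idea (requires scikit-learn >= 1.1 for `BisectingKMeans`).

```python
# Sketch: cluster document-chunk embeddings with bisecting k-means, then
# retrieve within the cluster closest to the query embedding.
import numpy as np
from sklearn.cluster import BisectingKMeans

rng = np.random.default_rng(0)
chunk_embeddings = rng.normal(size=(200, 64))          # stand-in for embedded chunks

kmeans = BisectingKMeans(n_clusters=8, random_state=0).fit(chunk_embeddings)

def retrieve(query_emb: np.ndarray, top_k: int = 5) -> np.ndarray:
    """Route the query to its nearest cluster, then rank chunks inside it by dot product."""
    cluster = kmeans.predict(query_emb[None, :])[0]
    idx = np.where(kmeans.labels_ == cluster)[0]
    sims = chunk_embeddings[idx] @ query_emb
    return idx[np.argsort(-sims)[:top_k]]

print(retrieve(rng.normal(size=64)))
```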
| 705 | Exploring Rewriting Approaches for Different Conversational Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we systematically investigate two different approaches, denoted as rewriting and fusion, on two fundamentally different generation tasks, including a text-to-text generation task and a multimodal generative task that takes as input text and generates a visualization or data table that answers the user’s question. |
MD MEHRAB TANJIM et. al. | arxiv-cs.CL | 2025-02-26 |
| 706 | Few-Shot Multilingual Open-Domain QA from 5 Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a \emph{few-shot learning} approach to synthesise large-scale multilingual data from large language models (LLMs). |
Fan Jiang; Tom Drummond; Trevor Cohn; | arxiv-cs.CL | 2025-02-26 |
| 707 | TRH2TQA: Table Recognition with Hierarchical Relationships to Table Question-Answering on Business Table Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Despite advancements in visual question answering, challenges persist with documents like financial reports, often structured in complicated tabular structures with complex … |
Pongsakorn Jirachanchaisiri; Nam Tuan Ly; Atsuhiro Takasu; | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-26 |
| 708 | Winning Big with Small Models: Knowledge Distillation Vs. Self-Training for Reducing Hallucination in QA Agents Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The deployment of Large Language Models (LLMs) in customer support is constrained by hallucination-generating false information-and the high cost of proprietary models. To address … |
ASHLEY LEWIS et. al. | ArXiv | 2025-02-26 |
| 709 | Winning Big with Small Models: Knowledge Distillation Vs. Self-Training for Reducing Hallucination in Product QA Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The deployment of Large Language Models (LLMs) in customer support is constrained by hallucination (generating false information) and the high cost of proprietary models. To address these challenges, we propose a retrieval-augmented question-answering (QA) pipeline and explore how to balance human input and automation. |
ASHLEY LEWIS et. al. | arxiv-cs.CL | 2025-02-26 |
| 710 | AdQuestA: Knowledge-Guided Visual Question Answer Framework for Advertisements Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the rapidly evolving landscape of digital marketing, effective customer engagement through advertisements is crucial for brands. Thus, computational understanding of ads is … |
NEHA CHOUDHARY et. al. | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-26 |
| 711 | Reference-Aligned Retrieval-Augmented Question Answering Over Heterogeneous Proprietary Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these, we propose a RAG-QA framework for internal enterprise use, consisting of: (1) a data pipeline that converts raw multi-modal documents into a structured corpus and QA pairs, (2) a fully on-premise, privacy-preserving architecture, and (3) a lightweight reference matcher that links answer segments to supporting content. |
NAYOUNG CHOI et. al. | arxiv-cs.AI | 2025-02-26 |
| 712 | Putting People in LLMs’ Shoes: Generating Better Answers Via Question Rewriter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their effectiveness in QA is often undermined by the vagueness of user questions. To address this issue, we introduce single-round instance-level prompt optimization, referred to as question rewriter. |
Junhao Chen; Bowen Wang; Zhouqiang Jiang; Yuta Nakashima; | aaai | 2025-02-25 |
| 713 | Union Is Strength! Unite The Power of LLMs and MLLMs for Chart Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To unite the strengths of LLMs and MLLMs to complement each other’s limitations, we propose Synergy, a framework that unites the power of both models for CQA. |
JIAPENG LIU et. al. | aaai | 2025-02-25 |
| 714 | RETQA: A Large-Scale Open-Domain Tabular Question Answering Dataset for Real Estate Sector Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Compared with existing tabular question answering datasets, RETQA poses greater challenges due to three key factors: long-table structures, open-domain retrieval, and multi-domain queries. To tackle these challenges, we propose the SLUTQA framework, which integrates large language models with spoken language understanding tasks to enhance retrieval and answering accuracy. |
Zhensheng Wang; Wenmian Yang; Kun Zhou; Yiquan Zhang; Weijia Jia; | aaai | 2025-02-25 |
| 715 | Audio-Visual Adaptive Fusion Network for Question Answering Based on Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Secondly, the fusion of audio-visual information is often weighted inadequately, limiting model performance. To address the above issues, we design the Audio-Visual Adaptive Fusion Network (AVAF-Net), which uses contrastive learning to align audio-visual information temporally and spatially and adaptively adjusts fusion weights based on the question. |
Xujian Zhao; Yixin Wang; Peiquan Jin; | aaai | 2025-02-25 |
| 716 | COLUMBUS: Evaluating COgnitive Lateral Understanding Through Multiple-Choice ReBUSes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Effective problem-solving also necessitates lateral thinking, which remains understudied in AI and has not been used to test visual perception systems. To bridge this gap, we formulate visual lateral thinking as a multiple-choice question-answering task and describe a three-step taxonomy-driven methodology for instantiating task examples. |
Koen Kraaijveld; Yifan Jiang; Kaixin Ma; Filip Ilievski; | aaai | 2025-02-25 |
| 717 | FLUE: Streamlined Uncertainty Estimation for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a promising solution that converts redundancy into randomness in the extensive parameters of LLMs to quantify knowledge uncertainty. |
SHIQI GAO et. al. | aaai | 2025-02-25 |
| 718 | Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we focus on the problem of image hallucination, where images created by TTI models fail to faithfully depict factual content. |
Youngsun Lim; Hojun Choi; Hyunjung Shim; | aaai | 2025-02-25 |
| 719 | VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although progress has been made in Composed Image Retrieval (CIR), we empirically find that a certain percentage of failure retrieval results are not consistent with their relative captions. |
CHUN-MEI FENG et. al. | aaai | 2025-02-25 |
| 720 | When Open-Vocabulary Visual Question Answering Meets Causal Adapter: Benchmark and Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing VQA benchmarks predominantly adhere to a closed-set paradigm, limiting their ability to address arbitrary, unseen answers, and thus falling short in real-world scenarios. To address this limitation, we introduce the Open-Vocabulary Visual Question Answering (OVVQA) benchmark, specifically designed to evaluate models under open-world conditions by assessing their performance on both base classes (seen, common answers) and novel classes (unseen, rare answers). |
Feifei Zhang; Zhaoyi Zhang; Xi Zhang; Changsheng Xu; | aaai | 2025-02-25 |
| 721 | Fine-grained Adaptive Visual Prompt for Generative Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we introduce an Adaptive Visual Prompt Creator that adaptively generates region-level visual prompts based on image characteristics of various organs, providing fine-grained references for LLMs during answer retrieval and generation from the medical domain, thereby improving the model’s precise cross-modal localization capabilities on original images. |
Ting Yu; Zixuan Tong; Jun Yu; Ke Zhang; | aaai | 2025-02-25 |
| 722 | TrustUQA: A Trustful Framework for Unified Structured Data Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose TrustUQA, a trustful QA framework that can simultaneously support multiple types of structured data in a unified way. |
WEN ZHANG et. al. | aaai | 2025-02-25 |
| 723 | DEQA: Descriptions Enhanced Question-Answering Framework for Multimodal Aspect-Based Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods use modality alignment for information interaction and fusion between images and text, but an inherent gap between these two modalities necessitates a more direct bridging mechanism to effectively connect image understanding with text content. For this, we propose the Descriptions Enhanced Question-Answering Framework (DEQA), which generates descriptions of images using GPT-4, leveraging the multimodal large language model to provide more direct semantic context of images. |
Zhixin Han; Mengting Hu; Yinhao Bai; Xunzhi Wang; Bitong Luo; | aaai | 2025-02-25 |
| 724 | Multimodal Hypothetical Summary for Retrieval-based Multi-image Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Conventional "retrieve-then-answer" pipelines often suffer from cascading errors because the training objective of QA fails to optimize the retrieval stage. To address this issue, we propose a novel method to effectively introduce and reference retrieved information into the QA. |
Peize Li; Qingyi Si; Peng Fu; Zheng Lin; Yan Wang; | aaai | 2025-02-25 |
| 725 | Granularity-Adaptive Spatial Evidence Tokenization for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a granularity-adaptive spatial evidence tokenization model for video question answering. |
HAO JIANG et. al. | aaai | 2025-02-25 |
| 726 | Uncertainty Quantification in Retrieval Augmented Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose to quantify the uncertainty of a QA model via estimating the utility of the passages it is provided with. |
Laura Perez-Beltrachini; Mirella Lapata; | arxiv-cs.CL | 2025-02-25 |
| 727 | Core-to-Global Reasoning for Compositional Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance the value and availability of semantic features, we propose a novel core-to-global reasoning (CTGR) model for compositional VQA. |
Hao Zhou; Tingjin Luo; Zhangqi Jiang; | aaai | 2025-02-25 |
| 728 | Assessing Modality Bias in Video Question Answering Benchmarks with Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing video question-answering (VidQA) benchmarks and datasets often exhibit a bias toward a single modality, despite the goal of requiring advanced reasoning skills that integrate diverse modalities to answer the queries. In this work, we introduce the modality importance score (MIS) to identify such bias. |
JEAN PARK et. al. | aaai | 2025-02-25 |
| 729 | Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing debiasing methods such as Translation Chain-of-Thought or Retrieval-augmented generation (RAG) by themselves fall short of closing this performance gap, often failing to improve performance across all languages and lacking specificity for the medical domain. To address this issue, we propose CLARA (Cross-Lingual Reflective Agentic system), a novel inference time de-biasing method leveraging retrieval augmented generation and self-verification. |
DAVID RESTREPO et. al. | aaai | 2025-02-25 |
| 730 | Explore What LLM Does Not Know in Complex Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, in this paper, we propose a novel Question Answering with Knowledge Evaluation (KEQA) framework to promote the effectiveness and efficiency of RAG in QA. |
Xin Lin; Zhenya Huang; Zhiqiang Zhang; Jun Zhou; Enhong Chen; | aaai | 2025-02-25 |
| 731 | Citations and Trust in LLM Generated Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explored trust through an anti-monitoring framework, where trust is predicted to be correlated with presence of citations and inversely related to checking citations. |
YIFAN DING et. al. | aaai | 2025-02-25 |
| 732 | Towards Robust Visual Question Answering Via Prompt-Driven Geometric Harmonization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, current VQA models still struggle with the challenges of minority class collapse and spurious semantic correlations posed by language bias and imbalanced distributions. To address these challenges, this paper proposes a novel Prompt-Driven Geometric Harmonization (PDGH) paradigm, which integrates both geometric structure and information entropy principles to enhance the ability of VQA models to generalize effectively across diverse scenarios. |
YISHU LIU et. al. | aaai | 2025-02-25 |
| 733 | Patch-level Sounding Object Tracking for Audio-Visual Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a new Patch-level Sounding Object Tracking (PSOT) method. |
ZHANGBIN LI et. al. | aaai | 2025-02-25 |
| 734 | Towards A Multimodal Large Language Model with Pixel-Level Insight for Biomedicine IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel end-to-end multimodal large language model for the biomedical domain, named MedPLIB, which possesses pixel-level understanding. |
XIAOSHUANG HUANG et. al. | aaai | 2025-02-25 |
| 735 | EPERM: An Evidence Path Enhanced Reasoning Model for Knowledge Graph Question and Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper reformulates the KGQA problem as a graphical model and proposes a three-stage framework named the Evidence Path Enhanced Reasoning Model (EPERM) for KGQA. |
Xiao Long; Liansheng Zhuang; Aodi Li; MingHong Yao; Shafei Wang; | aaai | 2025-02-25 |
| 736 | MULTITAT: Benchmarking Multilingual Table-and-Text Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the limitations, we propose the first multilingual TATQA dataset (MULTITAT). |
Xuanliang Zhang; Dingzirui Wang; Keyan Xu; Qingfu Zhu; Wanxiang Che; | arxiv-cs.CL | 2025-02-24 |
| 737 | Evaluating Robustness of LLMs in Question Answering on Multilingual Noisy OCR Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we conduct a comprehensive analysis of how OCR-induced noise affects the performance of Multilingual QA Systems. |
Bhawna Piryani; Jamshid Mozafari; Abdelrahman Abdallah; Antoine Doucet; Adam Jatowt; | arxiv-cs.CL | 2025-02-23 |
| 738 | Wrong Answers Can Also Be Useful: PlausibleQA — A Large-Scale QA Dataset with Answer Plausibility Scores Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing QA datasets primarily focus on correct answers without explicit consideration of the plausibility of other candidate answers, limiting opportunity for more nuanced evaluations of models. To address this gap, we introduce PlausibleQA, a large-scale dataset comprising 10,000 questions and 100,000 candidate answers, each annotated with plausibility scores and justifications for their selection. |
Jamshid Mozafari; Abdelrahman Abdallah; Bhawna Piryani; Adam Jatowt; | arxiv-cs.CL | 2025-02-22 |
| 739 | MHQA: A Diverse, Knowledge Intensive Mental Health Question Answering Challenge for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work presents a novel multiple choice dataset, MHQA (Mental Health Question Answering), for benchmarking Language models (LMs). |
SURAJ RACHA et. al. | arxiv-cs.CL | 2025-02-21 |
| 740 | Exploring Advanced Techniques for Visual Question Answering: A Comprehensive Comparison Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Analyzing VQA datasets is essential for developing robust models that can handle the complexities of multimodal reasoning. |
Aiswarya Baby; Tintu Thankom Koshy; | arxiv-cs.CV | 2025-02-20 |
| 741 | Argument-Based Comparative Question Answering Evaluation Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to solve the problems standing in the way of automatic comparative question answering. |
IRINA NIKISHINA et. al. | arxiv-cs.CL | 2025-02-20 |
| 742 | Benchmarking Multimodal RAG Through A Chart-based Document Question-Answering Generation Framework Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing benchmarks primarily focus on simple image-text interactions, overlooking complex visual formats like charts that are prevalent in real-world applications. In this work, we introduce a novel task, Chart-based MRAG, to address this limitation. |
YUMING YANG et. al. | arxiv-cs.AI | 2025-02-20 |
| 743 | Quantifying Memorization and Parametric Response Rates in Retrieval-Augmented Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze the extent to which multimodal retrieval-augmented VLMs memorize training data compared to baseline VLMs. |
Peter Carragher; Abhinand Jha; R Raghav; Kathleen M. Carley; | arxiv-cs.LG | 2025-02-19 |
| 744 | PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a privacy preservation pipeline for protecting privacy and sensitive information during interactions between users and LLMs in practical LLM usage scenarios. |
GUANGWEI LI et. al. | arxiv-cs.CL | 2025-02-19 |
| 745 | MuDAF: Long-Context Multi-Document Attention Focusing Through Contrastive Learning on Attention Heads Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Multi-Document Attention Focusing (MuDAF), a novel method that explicitly optimizes the attention distribution at the head level through contrastive learning. |
WEIHAO LIU et. al. | arxiv-cs.CL | 2025-02-19 |
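A toy InfoNCE-style objective over the attention mass a single head assigns to each retrieved passage, with the answer-bearing passage as the positive and the other passages as negatives. This illustrates head-level contrastive training in spirit only; MuDAF's actual formulation may differ, and the scores below are invented.

```python
# Sketch: contrastive loss over per-passage attention scores from one head.
import torch
import torch.nn.functional as F

def head_contrastive_loss(head_scores: torch.Tensor, gold_idx: torch.Tensor,
                          temperature: float = 0.1) -> torch.Tensor:
    """head_scores: (batch, n_passages) attention mass the head assigns to each passage.
    gold_idx: (batch,) index of the passage containing the answer (the positive)."""
    logits = head_scores / temperature
    return F.cross_entropy(logits, gold_idx)   # softmax over passages, NLL of the gold one

scores = torch.tensor([[0.50, 0.20, 0.30],     # toy attention mass per passage
                       [0.10, 0.70, 0.20]])
gold = torch.tensor([0, 1])
print(head_contrastive_loss(scores, gold))
```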
| 746 | RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing retrieval approaches often overlook the importance of factual knowledge, which limits the relevance of retrieved conceptual knowledge and restricts its applicability in real-world scenarios, such as clinical decision-making based on Electronic Health Records (EHRs). This paper introduces RGAR, a recurrence generation-augmented retrieval framework that retrieves both relevant factual and conceptual knowledge from dual sources (i.e., EHRs and the corpus), allowing them to interact and refine each other. |
SICHU LIANG et. al. | arxiv-cs.CL | 2025-02-18 |
| 747 | Clinical QA 2.0: Multi-Task Learning for Answer Extraction and Categorization Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Clinical Question Answering (CQA) plays a crucial role in medical decision-making, enabling physicians to extract relevant information from Electronic Medical Records (EMRs). … |
PRIYARANJAN PATTNAYAK et. al. | 2025 IEEE International Conference on Electro Information … | 2025-02-18 |
| 748 | MCTS-KBQA: Monte Carlo Tree Search for Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although recent approaches leveraging LLMs as agents have demonstrated considerable potential, these studies are inherently constrained by their linear decision-making processes. To address this limitation, we propose a MCTS-based framework that enhances LLMs’ reasoning capabilities through tree search methodology. |
Guanming Xiong; Haochen Li; Wen Zhao; | arxiv-cs.CL | 2025-02-18 |
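A compact, generic MCTS loop over a toy two-hop relation path, included only to show the tree-search skeleton (UCB selection, expansion, rollout, backpropagation) that this line of work builds on. The relations and reward are invented, and the paper's LLM-guided agent and real knowledge base are not modeled.

```python
# Toy Monte Carlo Tree Search over relation paths in a tiny, made-up KB.
import math, random

RELATIONS = ["born_in", "capital_of", "located_in", "spouse_of"]
TARGET = ["born_in", "located_in"]          # hypothetical gold 2-hop relation path

class Node:
    def __init__(self, path, parent=None):
        self.path, self.parent = path, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(math.log(node.parent.visits) / node.visits)

def rollout(path):
    while len(path) < len(TARGET):
        path = path + [random.choice(RELATIONS)]
    return 1.0 if path == TARGET else 0.0

def mcts(iterations=500):
    root = Node([])
    for _ in range(iterations):
        node = root
        while node.children:                       # selection
            node = max(node.children, key=ucb)
        if len(node.path) < len(TARGET):           # expansion
            node.children = [Node(node.path + [r], node) for r in RELATIONS]
            node = random.choice(node.children)
        reward = rollout(node.path)                # simulation
        while node:                                # backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda n: n.visits).path   # most-visited first hop

random.seed(0)
print(mcts())
```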
| 749 | Towards Question Answering Over Large Semi-structured Tables Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, such solutions are subject to program generation and execution errors and make it difficult to ensure decomposition quality. To address this issue, we propose TaDRe, a TableQA model that incorporates both pre- and post-table decomposition refinements to ensure table decomposition quality, hence achieving highly accurate TableQA results. |
Yuxiang Wang; Junhao Gan; Jianzhong Qi; | arxiv-cs.CL | 2025-02-18 |
| 750 | SearchRAG: Can Search Engines Be Helpful for LLM-based Medical Question Answering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conventional Retrieval-Augmented Generation (RAG) techniques typically retrieve external information from static knowledge bases, which can be outdated or incomplete, missing fine-grained clinical details essential for accurate medical question answering. In this work, we propose SearchRAG, a novel framework that overcomes these limitations by leveraging real-time search engines. |
YUCHENG SHI et. al. | arxiv-cs.CL | 2025-02-18 |
| 751 | Clinical QA 2.0: Multi-Task Learning for Answer Extraction and Categorization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While transformer-based models such as BERT, BioBERT, and ClinicalBERT have demonstrated state-of-the-art performance in CQA, existing models lack the ability to categorize extracted answers, which is critical for structured retrieval, content filtering, and medical decision support. To address this limitation, we introduce a Multi-Task Learning (MTL) framework that jointly trains CQA models for both answer extraction and medical categorization. |
PRIYARANJAN PATTNAYAK et. al. | arxiv-cs.CL | 2025-02-18 |
| 752 | LM Agents for Coordinating Multi-User Information Gathering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces PeopleJoin, a benchmark for evaluating LM-mediated collaborative problem solving. |
Harsh Jhamtani; Jacob Andreas; Benjamin Van Durme; | arxiv-cs.CL | 2025-02-17 |
| 753 | Ontology-Guided Reverse Thinking Makes Large Language Models Stronger on Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a result, it is difficult to establish reasoning paths to the purpose, which leads to information loss and redundancy. To address this issue, inspired by human reverse thinking, we propose Ontology-Guided Reverse Thinking (ORT), a novel framework that constructs reasoning paths from purposes back to conditions. |
RUNXUAN LIU et. al. | arxiv-cs.CL | 2025-02-17 |
| 754 | Open-Ended and Knowledge-Intensive Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through our proposed approach, we achieve a substantial 17.5% improvement in accuracy on multiple choice questions in the KnowIT VQA dataset, establishing new state-of-the-art performance levels. |
Md Zarif Ul Alam; Hamed Zamani; | arxiv-cs.IR | 2025-02-17 |
| 755 | QuOTE: Question-Oriented Text Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present QuOTE (Question-Oriented Text Embeddings), a novel enhancement to retrieval-augmented generation (RAG) systems, aimed at improving document representation for accurate and nuanced retrieval. |
Andrew Neeser; Kaylen Latimer; Aadyant Khatri; Chris Latimer; Naren Ramakrishnan; | arxiv-cs.IR | 2025-02-15 |
| 756 | NitiBench: A Comprehensive Study of LLM Framework Capabilities for Thai Legal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To support fair evaluation, we propose tailored multi-label retrieval metrics and the use of an LLM-as-judge method for coverage and contradiction detection. |
PAWITSAPAK AKARAJARADWONG et. al. | arxiv-cs.CL | 2025-02-15 |
| 757 | Post-training An LLM for RAG? Train on Self-Generated Demonstrations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a recipe for training RAG-enabled LLMs using self-generated demonstrations, thereby avoiding training on out-of-distribution text and integrating retrievals into the LLM responses. |
MATTHEW FINLAYSON et. al. | arxiv-cs.CL | 2025-02-14 |
| 758 | Evaluating The Meta- and Object-Level Reasoning of Large Language Models for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) excel in natural language tasks but still face challenges in Question Answering (QA) tasks requiring complex, multi-step reasoning. We outline the types of reasoning required in some of these tasks, and reframe them in terms of meta-level reasoning (akin to high-level strategic reasoning or planning) and object-level reasoning (embodied in lower-level tasks such as mathematical reasoning). |
Nick Ferguson; Liane Guillou; Alan Bundy; Kwabena Nuamah; | arxiv-cs.CL | 2025-02-14 |
| 759 | Abduction of Domain Relationships from Data for VQA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the problem of visual question answering (VQA) where the image and query are represented by ASP programs that lack domain data. |
Al Mehdi Saadat Chowdhury; Paulo Shakarian; Gerardo I. Simari; | arxiv-cs.LO | 2025-02-13 |
| 760 | LP-LM: No Hallucinations in Question Answering with Logic Programming Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces LP-LM, a system that grounds answers to questions in known facts contained in a knowledge base (KB), facilitated through semantic parsing in Prolog, and always produces answers that are reliable. |
Katherine Wu; Yanhong A. Liu; | arxiv-cs.AI | 2025-02-13 |
| 761 | A Review on Persian Question Answering Systems: from Traditional to Modern Approaches Related Papers Related Patents Related Grants Related Venues Related Experts View |
Safoura Aghadavoud Jolfaei; Azadeh Mohebi; | Artif. Intell. Rev. | 2025-02-13 |
| 762 | SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces SQuARE (Sequential Question Answering Reasoning Engine), a novel prompting technique designed to improve reasoning through a self-interrogation paradigm. |
Daniel Fleischer; Moshe Berchansky; Gad Markovits; Moshe Wasserblat; | arxiv-cs.CL | 2025-02-13 |
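A rough sketch of a self-interrogation prompting loop in the same spirit: the model poses and answers a few auxiliary sub-questions before answering the main one. The prompt wording and the `ask_llm` client are hypothetical placeholders, not the paper's released prompts; plug in your own chat-completion call to run it.

```python
# Sketch: sequential self-interrogation before answering the main question.
def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace with a call to your chat-completion client.
    raise NotImplementedError("plug in your LLM client here")

def answer_with_subquestions(question: str, n_subquestions: int = 3) -> str:
    sub_qas = []
    for _ in range(n_subquestions):
        context = "\n".join(f"Q: {q}\nA: {a}" for q, a in sub_qas)
        sub_q = ask_llm(
            f"Main question: {question}\n{context}\n"
            f"Pose one new auxiliary question that would help answer the main question."
        )
        sub_a = ask_llm(f"Answer concisely: {sub_q}")
        sub_qas.append((sub_q, sub_a))
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in sub_qas)
    return ask_llm(f"{context}\nUsing the Q/A pairs above, answer: {question}")
```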
| 763 | Neuro-Conceptual Artificial Intelligence: Integrating OPM with Deep Learning to Enhance Question Answering Quality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Neuro-Conceptual Artificial Intelligence (NCAI), a specialization of the neuro-symbolic AI approach that integrates conceptual modeling using Object-Process Methodology (OPM) ISO 19450:2024 with deep learning to enhance question-answering (QA) quality. |
Xin Kang; Veronika Shteingardt; Yuhan Wang; Dov Dori; | arxiv-cs.CL | 2025-02-12 |
| 764 | FoQA: A Faroese Question-Answering Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present FoQA, a Faroese extractive question-answering (QA) dataset with 2,000 samples, created using a semi-automated approach combining Large Language Models (LLMs) and human validation. |
Annika Simonsen; Dan Saattrup Nielsen; Hafsteinn Einarsson; | arxiv-cs.CL | 2025-02-11 |
| 765 | On Mechanistic Circuits for Extractive Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models are increasingly used to process documents and facilitate question-answering on them. |
SAMYADEEP BASU et. al. | arxiv-cs.CL | 2025-02-11 |
| 766 | Intelligent Legal Assistant: An Interactive Clarification System for Legal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a legal question-answering system called Intelligent Legal Assistant, which interacts with users to precisely capture their needs. |
RUJING YAO et. al. | arxiv-cs.CL | 2025-02-11 |
| 767 | From Data to Decisions: Enterprise-Level Domain-Specific Graph Retrieval-Augmented Generation Systems for Advanced Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Retrieval-Augmented Generation (RAG) is aimed at improving the functionality of large language model (LLM) applications by incorporating specific data. This may include searching … |
Sandeep Varma; S. Shivam; Sarun Natarajan; Ankita Banerjee; Sourodeep Roy; | 2025 IEEE International Conference on Big Data and Smart … | 2025-02-09 |
| 768 | Multi-granular Training Strategies for Robust Multi-hop Reasoning Over Noisy and Heterogeneous Knowledge Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Adaptive Multi-source Knowledge-Oriented Reasoning (AMKOR), a generative framework that leverages large language models (LLMs) to dynamically fuse parametric and retrieved knowledge while exploring reasoning trajectories using probabilistic beam reasoning. |
Jackson Coleman; Isaiah Lawrence; Benjamin Turner; | arxiv-cs.CL | 2025-02-09 |
| 769 | ARR: Question Answering with Large Language Models Via Analyzing, Retrieving, and Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces ARR, an intuitive, effective, and general QA solving method that explicitly incorporates three key steps: analyzing the intent of the question, retrieving relevant information, and reasoning step by step. |
Yuwei Yin; Giuseppe Carenini; | arxiv-cs.CL | 2025-02-07 |
| 770 | The Role of Prosody in Spoken Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the role of prosody in Spoken Question Answering. |
Jie Chi; Maureen de Seyssel; Natalie Schluter; | arxiv-cs.CL | 2025-02-07 |
| 771 | LLMs to Support A Domain Specific Knowledge Assistant Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents a custom approach to developing a domain specific knowledge assistant for sustainability reporting using the International Financial Reporting Standards (IFRS). |
Maria-Flavia Lovin; | arxiv-cs.CL | 2025-02-06 |
| 772 | Understanding and Supporting Formal Email Exchange By Answering AI-Generated Questions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although systems with Large Language Models (LLMs) were designed to simplify the email replying process, users still need to provide detailed prompts to obtain the expected output. Therefore, we proposed and evaluated an LLM-powered question-and-answer (QA)-based approach for users to reply to emails by answering a set of simple and short questions generated from the incoming email. |
YUSUKE MIURA et. al. | arxiv-cs.HC | 2025-02-06 |
| 773 | Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present QUestion-only replay with Attention Distillation (QUAD), a novel approach for VQACL that leverages only past task questions for regularization. |
Imad Eddine Marouf; Enzo Tartaglione; Stephane Lathuiliere; Joost van de Weijer; | arxiv-cs.CV | 2025-02-06 |
| 774 | TerraQ: Spatiotemporal Question-Answering on Satellite Image Archives Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: TerraQ is a spatiotemporal question-answering engine for satellite image archives. It is a natural language processing system that is built to process requests for satellite … |
Sergios-Anestis Kefalidis; Konstantinos Plas; Manolis Koubarakis; | arxiv-cs.CV | 2025-02-06 |
| 775 | MedBioLM: Optimizing Medical and Biological QA with Fine-Tuned Large Language Models and Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce MedBioLM, a domain-adapted biomedical question-answering model designed to enhance both short-form and long-form queries. |
Seonok Kim; | arxiv-cs.CL | 2025-02-05 |
| 776 | Adaptive Sparse Triple Convolutional Attention for Enhanced Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ronggui Wang; Hong Chen; Juan Yang; Lixia Xue; | Vis. Comput. | 2025-02-04 |
| 777 | AmaSQuAD: A Benchmark for Amharic Extractive Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research presents a novel framework for translating extractive question-answering datasets into low-resource languages, as demonstrated by the creation of the AmaSQuAD dataset, a translation of SQuAD 2.0 into Amharic. |
Nebiyou Daniel Hailemariam; Blessed Guda; Tsegazeab Tefferi; | arxiv-cs.CL | 2025-02-04 |
| 778 | TUMTraffic-VideoQA: A Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present TUMTraffic-VideoQA, a novel dataset and benchmark designed for spatio-temporal video understanding in complex roadside traffic scenarios. |
XINGCHENG ZHOU et. al. | arxiv-cs.CV | 2025-02-04 |
| 779 | SensorChat: Answering Qualitative and Quantitative Questions During Long-Term Multimodal Sensor Interactions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce SensorChat, the first end-to-end QA system designed for daily life monitoring using long-duration, high-frequency time series data. |
XIAOFAN YU et. al. | arxiv-cs.AI | 2025-02-04 |
| 780 | Spatial-RAG: Spatial Retrieval Augmented Generation for Real-World Geospatial Reasoning Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing large language models (LLMs) lack spatial computing capabilities and access to up-to-date, ubiquitous real-world geospatial data, while traditional geospatial systems fall short in interpreting natural language. To bridge this gap, we introduce Spatial-RAG, a Retrieval-Augmented Generation (RAG) framework designed for geospatial question answering. |
DAZHOU YU et. al. | arxiv-cs.IR | 2025-02-03 |
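A toy illustration of the two-step idea of filtering candidates with a spatial predicate and then ranking by textual relevance. The places, coordinates, and keyword scoring are invented and far simpler than the paper's Spatial-RAG framework; the sketch assumes the `shapely` package and a rough degrees-to-kilometres conversion.

```python
# Sketch: spatial filter (distance predicate) followed by a crude text-relevance rank.
from shapely.geometry import Point

places = [
    {"name": "Cafe Aurora", "desc": "riverside cafe with outdoor seating", "geom": Point(0.004, 0.002)},
    {"name": "Museum of Art", "desc": "modern art museum", "geom": Point(0.001, 0.001)},
    {"name": "Cafe Nord", "desc": "cozy cafe near the station", "geom": Point(0.02, 0.03)},
]
hotel = Point(0.0, 0.0)

def spatial_then_text(query_terms, origin, max_km=1.0):
    # ~111 km per degree is a rough equirectangular approximation for the demo.
    candidates = [p for p in places if p["geom"].distance(origin) * 111 <= max_km]
    scored = [(sum(t in p["desc"] for t in query_terms), p["name"]) for p in candidates]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]

print(spatial_then_text(["riverside", "cafe"], hotel))
```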
| 781 | Language Models Prefer What They Know: Relative Confidence Estimation Via Confidence Preferences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose relative confidence estimation, where we match up questions against each other and ask the model to make relative judgments of confidence (Which question are you more confident in answering correctly?) |
Vaishnavi Shrivastava; Ananya Kumar; Percy Liang; | arxiv-cs.CL | 2025-02-03 |
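A small sketch of aggregating pairwise "which question are you more confident about?" judgments into per-question scores with Elo updates, one plausible rank-aggregation choice. The `prefers` heuristic below is a stand-in for an actual LLM comparison call, and the questions are invented.

```python
# Sketch: Elo aggregation of pairwise relative-confidence judgments.
import itertools

def prefers(q_a: str, q_b: str) -> bool:
    """Hypothetical stand-in: True if the model reports more confidence on q_a than q_b."""
    return len(q_a) < len(q_b)   # toy heuristic for the demo only

def elo_scores(questions, k=32, rounds=5):
    rating = {q: 1000.0 for q in questions}
    for _ in range(rounds):
        for q_a, q_b in itertools.combinations(questions, 2):
            expected_a = 1.0 / (1.0 + 10 ** ((rating[q_b] - rating[q_a]) / 400))
            outcome_a = 1.0 if prefers(q_a, q_b) else 0.0
            rating[q_a] += k * (outcome_a - expected_a)
            rating[q_b] += k * ((1.0 - outcome_a) - (1.0 - expected_a))
    return rating

qs = ["What is 2+2?", "Who won the 1954 World Cup?", "Prove the Riemann hypothesis."]
print(sorted(elo_scores(qs).items(), key=lambda kv: -kv[1]))
```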
| 782 | ChartCitor: Multi-Agent Framework for Fine-Grained Chart Visual Attribution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present ChartCitor, a multi-agent framework that provides fine-grained bounding box citations by identifying supporting evidence within chart images. |
Kanika Goswami; Puneet Mathur; Ryan Rossi; Franck Dernoncourt; | arxiv-cs.CL | 2025-02-02 |
| 783 | Knowledge Graph Based Question-answering Model with Subgraph Retrieval Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View |
RUI ZHU et. al. | Comput. Oper. Res. | 2025-02-01 |
| 784 | RK-VQA: Rational Knowledge-aware Fusion-in-decoder for Knowledge-based Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Weipeng Chen; Xu Huang; Zifeng Liu; Jin Liu; Lan You; | Inf. Fusion | 2025-02-01 |
| 785 | Diff-ZsVQA: Zero-shot Visual Question Answering with Frozen Large Language Models Using Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View |
QUANXING XU et. al. | Expert Syst. Appl. | 2025-02-01 |
| 786 | Multilingual State Space Models for Structured Question Answering in Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose enhancements to existing SSM frameworks, optimizing their applicability to low-resource settings and multilingual scenarios prevalent in Indic languages. |
Arpita Vats; Rahul Raja; Mrinal Mathur; Vinija Jain; Aman Chadha; | arxiv-cs.CL | 2025-02-01 |
| 787 | Robust Data Augmentation and Contrast Learning for Debiased Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ke Ning; Zhixin Li; | Neurocomputing | 2025-02-01 |
| 788 | Knowledge Augmented Expert Finding Framework Via Knowledge Graph Embedding for Community Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
YUE LIU et. al. | Eng. Appl. Artif. Intell. | 2025-02-01 |
| 789 | LLM Based QA Chatbot Builder: A Generative AI-based Chatbot Builder for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Md. Shahidul Salim; Sk Imran Hossain; Tanim Jalal; Dhiman Bose; Mohammad Jahid Ibna Basher; | SoftwareX | 2025-02-01 |
| 790 | ENVQA: Improving Visual Question Answering Model By Enriching The Visual Feature IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Souvik Chowdhury; Badal Soni; | Eng. Appl. Artif. Intell. | 2025-02-01 |
| 791 | Language-guided Bias Generation Contrastive Strategy for Visual Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Visual question answering (VQA) is a challenging task that requires models to understand both visual and linguistic inputs and produce accurate answers. However, VQA models often … |
ENYUAN ZHAO et. al. | ACM Transactions on Multimedia Computing, Communications … | 2025-01-30 |
| 792 | CALM: Unleashing The Cross-Lingual Self-Aligning Ability of Language Model Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We offer a qualitative analysis of how cross-lingual consistency can enhance knowledge alignment and explore the method’s generalizability. |
Yumeng Wang; Zhiyuan Fan; Qingyun Wang; May Fung; Heng Ji; | arxiv-cs.CL | 2025-01-30 |
| 793 | Cross-Language Approach for Quranic QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these systems face unique challenges, including the linguistic disparity between questions written in Modern Standard Arabic and answers found in Quranic verses written in Classical Arabic, and the small size of existing datasets, which further restricts model performance. To address these challenges, we adopt a cross-language approach by (1) Dataset Augmentation: expanding and enriching the dataset through machine translation to convert Arabic questions into English, paraphrasing questions to create linguistic diversity, and retrieving answers from an English translation of the Quran to align with multilingual training requirements; and (2) Language Model Fine-Tuning: utilizing pre-trained models such as BERT-Medium, RoBERTa-Base, DeBERTa-v3-Base, ELECTRA-Large, Flan-T5, Bloom, and Falcon to address the specific requirements of Quranic QA. |
Islam Oshallah; Mohamed Basem; Ali Hamdi; Ammar Mohammed; | arxiv-cs.CL | 2025-01-29 |
| 794 | Hybrid Graphs for Table-and-Text Based Question Answering Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel Hybrid Graph-based approach for Table-Text QA that leverages LLMs without fine-tuning. |
Ankush Agarwal; Ganesh S; Chaitanya Devaguptapu; | arxiv-cs.CL | 2025-01-29 |
| 795 | PISCO: Pretty Simple Compression for Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce PISCO, a novel method that achieves a 16x compression rate with minimal accuracy loss (0-3%) across diverse RAG-based question-answering (QA) tasks. |
Maxime Louis; Hervé Déjean; Stéphane Clinchant; | arxiv-cs.CL | 2025-01-27 |
| 796 | Federated Retrieval Augmented Generation for Multi-Product Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce MKP-QA, a novel multi-product knowledge-augmented QA framework with probabilistic federated search across domains and relevant knowledge. |
PARSHIN SHOJAEE et. al. | arxiv-cs.CL | 2025-01-24 |
| 797 | Unlocking Wisdom: Enhancing Biomedical Question Answering with Domain Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View |
Bita Azad; Mahdiyar Ali Akbar Alavi; Parastoo Jafarzadeh; F. Ensan; Dimitrios Androutsos; | Knowl. Inf. Syst. | 2025-01-23 |
| 798 | ReasVQA: Advancing VideoQA with Imperfect Reasoning Process Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce ReasVQA (Reasoning-enhanced Video Question Answering), a novel approach that leverages reasoning processes generated by Multimodal Large Language Models (MLLMs) to improve the performance of VideoQA models. |
JIANXIN LIANG et. al. | arxiv-cs.CV | 2025-01-23 |
| 799 | Question Answering on Patient Medical Records with Private Fine-Tuned LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, ensuring privacy and compliance requires edge and private deployments of LLMs. This paper proposes a novel approach to semantic QA over EHRs by first identifying the most relevant FHIR resources for a user query (Task1) and subsequently answering the query based on these resources (Task2). |
Sara Kothari; Ayush Gupta; | arxiv-cs.CL | 2025-01-23 |
| 800 | ENTER: Event Based Interpretable Reasoning for VideoQA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present ENTER, an interpretable Video Question Answering (VideoQA) system based on event graphs. |
HAMMAD AYYUBI et. al. | arxiv-cs.CV | 2025-01-23 |
| 801 | K-COMP: Retrieval-Augmented Medical Domain Question Answering With Knowledge-Injected Compressor Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As a result, the documents include some inaccurate information, which could lead the reader model to mistrust the passages and could result in hallucinations. To solve these problems, we propose K-comp (Knowledge-injected compressor) which provides the knowledge required to answer correctly. |
Jeonghun Cho; Gary Geunbae Lee; | arxiv-cs.CL | 2025-01-23 |
| 802 | Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, RAG is not always needed, as it may introduce irrelevant information. Recent adaptive retrieval methods integrate LLMs’ intrinsic knowledge with external information appealing to LLM self-knowledge, but they often neglect efficiency evaluations and comparisons with uncertainty estimation techniques. |
VIKTOR MOSKVORETSKII et. al. | arxiv-cs.CL | 2025-01-22 |
| 803 | Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel design to combine knowledge graph and LLMs for zero-shot visual question answering. |
Qian Tao; Xiaoyang Fan; Yong Xu; Xingquan Zhu; Yufei Tang; | arxiv-cs.CV | 2025-01-22 |
| 804 | Bidirectional Cascaded Multimodal Attention for Multiple Choice Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Sushmita Upadhyay; S. S. Tripathy; | Mach. Vis. Appl. | 2025-01-22 |
| 805 | QATCH: Automatic Evaluation of SQL-Centric Tasks on Proprietary Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Tabular Representation Learning (TRL) and Large Language Models (LLMs) have become established for tackling Question Answering (QA) and Semantic Parsing (SP) tasks on tabular … |
Simone Papicchio; Paolo Papotti; Luca Cagliero; | ACM Transactions on Intelligent Systems and Technology | 2025-01-20 |
| 806 | Question-to-Question Retrieval for Hallucination-Free Knowledge Access: An Approach for Wikipedia and Wikidata Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an approach to question answering over knowledge bases like Wikipedia and Wikidata by performing question-to-question matching and retrieval from a dense vector embedding store. |
Santhosh Thottingal; | arxiv-cs.CL | 2025-01-20 |
| 807 | A Collection of Question Answering Datasets for Norwegian Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a new suite of question answering datasets for Norwegian; NorOpenBookQA, NorCommonSenseQA, NorTruthfulQA, and NRK-Quiz-QA. |
Vladislav Mikhailov; Petter Mæhlum; Victoria Ovedie Chruickshank Langø; Erik Velldal; Lilja Øvrelid; | arxiv-cs.CL | 2025-01-19 |
| 808 | Leveraging Chain of Thought Towards Empathetic Spoken Dialogue Without Corresponding Question-Answering Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we propose a novel approach that circumvents the need for question-answering data, called Listen, Perceive, and Express (LPE). |
JINGRAN XIE et. al. | arxiv-cs.CL | 2025-01-18 |
| 809 | InsQABench: Benchmarking Chinese Insurance Domain Question Answering with Large Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The application of large language models (LLMs) has achieved remarkable success in various fields, but their effectiveness in specialized domains like the Chinese insurance … |
JING DING et. al. | arxiv-cs.CL | 2025-01-18 |
| 810 | SAFFNet: Self-attention Based on Fourier Frequency Domain Filter Network for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jingya Shi; Dezhi Han; Chongqing Chen; Xiang Shen; | Vis. Comput. | 2025-01-17 |
| 811 | Algorithm for Semantic Network Generation from Texts of Low Resource Languages Such As Kiswahili Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Processing low-resource languages, such as Kiswahili, using machine learning is difficult due to lack of adequate training data. However, such low-resource languages are still … |
Barack Wamkaya Wanjawa; Lawrence Muchemi; Evans Miriti; | arxiv-cs.CL | 2025-01-16 |
| 812 | Passage Segmentation of Documents for Extractive Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study emphasizes the critical role of chunking in improving the performance of both dense passage retrieval and the end-to-end RAG pipeline. |
Zuhong Liu; Charles-Elie Simon; Fabien Caspani; | arxiv-cs.CL | 2025-01-16 |
| 813 | Admitting Ignorance Helps The Video Question Answering Models to Answer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We argue that these models often establish shortcuts, resulting in spurious correlations between questions and answers, especially when the alignment between video and text data is suboptimal. To address these spurious correlations, we propose a novel training framework in which the model is compelled to acknowledge its ignorance when presented with an intervened question, rather than making guesses solely based on superficial question-answer correlations. |
Haopeng Li; Tom Drummond; Mingming Gong; Mohammed Bennamoun; Qiuhong Ke; | arxiv-cs.CV | 2025-01-15 |
| 814 | To Retrieve or Not to Retrieve? Uncertainty Detection for Dynamic Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore multiple uncertainty detection methods and evaluate them for the task of long-form question answering, employing dynamic retrieval, and present our comparisons. |
Kaustubh D. Dhole; | arxiv-cs.CL | 2025-01-15 |
| 815 | ASTRID – An Automated and Scalable TRIaD for The Evaluation of RAG-based Clinical Question Answering Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) have shown impressive potential in clinical question answering (QA), with Retrieval Augmented Generation (RAG) emerging as a leading approach for … |
Mohita Chowdhury; Yajie Vera He; A. Higham; Ernest Lim; | Annual Meeting of the Association for Computational … | 2025-01-14 |
| 816 | ASTRID — An Automated and Scalable TRIaD for The Evaluation of RAG-based Clinical Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using clinical human evaluations of responses is expensive, unscalable, and not conducive to the continuous iterative development of RAG systems. To address these challenges, we introduce ASTRID, an Automated and Scalable TRIaD for evaluating clinical QA systems leveraging RAG, consisting of three metrics: Context Relevance (CR), Refusal Accuracy (RA), and Conversational Faithfulness (CF). |
Mohita Chowdhury; Yajie Vera He; Jared Joselowitz; Aisling Higham; Ernest Lim; | arxiv-cs.CL | 2025-01-14 |
| 817 | Fine-tuning Large Language Models for Improving Factuality in Legal Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We conduct extensive real-data experiments to validate the effectiveness of our approach. |
Yinghao Hu; Leilei Gan; Wenyi Xiao; Kun Kuang; Fei Wu; | arxiv-cs.CL | 2025-01-11 |
| 818 | Top 2 at ALQAC 2024: Large Language Models (LLMs) for Legal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Huy Quang Pham; Quan Van Nguyen; Dan Quang Tran; Thang Kien-Bao Nguyen; Kiet Van Nguyen; | Int. J. Asian Lang. Process. | 2025-01-10 |
| 819 | SensorQA: A Question Answering Benchmark for Daily-Life Monitoring Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While existing research primarily focuses on learning classification models, fewer studies have explored how end users can actively extract useful insights from sensor data, often hindered by the lack of a proper dataset. To address this gap, we introduce SensorQA, the first human-created question-answering (QA) dataset for long-term time-series sensor data for daily life monitoring. |
BENJAMIN REICHMAN et. al. | arxiv-cs.CL | 2025-01-09 |
| 820 | TimelineKGQA: A Comprehensive Question-Answer Pair Generator for Temporal Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel categorization framework based on timeline-context relationships, along with TimelineKGQA, a universal temporal QA generator applicable to any TKGs. |
Qiang Sun; Sirui Li; Du Huynh; Mark Reynolds; Wei Liu; | arxiv-cs.LO | 2025-01-08 |
| 821 | Multimodal Multihop Source Retrieval for Web Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the importance of graph structure for multi-modal multi-hop question answering. |
Navya Yarrabelly; Saloni Mittal; | arxiv-cs.CL | 2025-01-07 |
| 822 | Multilingual Open QA on The MIA Shared Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. |
Navya Yarrabelly; Saloni Mittal; Ketan Todi; Kimihiro Hasegawa; | arxiv-cs.CL | 2025-01-07 |
| 823 | BoundingDocs: A Unified Dataset for Document Question Answering with Spatial Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a unified dataset for document Question-Answering (QA), which is obtained by combining several public datasets related to Document AI and visually rich document understanding (VRDU). |
Simone Giovannini; Fabio Coppini; Andrea Gemelli; Simone Marinai; | arxiv-cs.CL | 2025-01-06 |
| 824 | CoReQA: Uncovering Potentials of Language Models in Code Repository Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These benchmarks fail to capture the real-world complexity of software engineering and user requirements for understanding code repositories. To address this gap, we introduce CoReQA, a benchmark for Code Repository-level question answering, constructed from GitHub issues and comments from 176 popular repositories across four programming languages. |
JIALIANG CHEN et. al. | arxiv-cs.SE | 2025-01-06 |
| 825 | ReDiT: Re-evaluating Large Visual Question Answering Model Confidence By Defining Input Scenario Difficulty and Applying Temperature Mapping Related Papers Related Patents Related Grants Related Venues Related Experts View |
Modafar Al-Shouha; Gábor Szücs; | Multim. Syst. | 2025-01-06 |
| 826 | QuIM-RAG: Advancing Retrieval-Augmented Generation with Inverted Question Matching for Enhanced QA Performance IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce QuIM-RAG (Question-to-question Inverted Index Matching), a novel approach for the retrieval mechanism in our system. |
Binita Saha; Utsha Saha; Muhammad Zubair Malik; | arxiv-cs.CL | 2025-01-05 |
| 827 | Indonesian Linguistic Ontology for Enhancing Ontology-Based Indonesian Question-Answering Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The development of Linguistic Ontology is essential to advance the Indonesian ontology-based question-answering (QA) system. This paper describes the development of Indonesian … |
Fadhila Tangguh Admojo; Adidah Lajis; Haidawati Nasir; | 2025 19th International Conference on Ubiquitous … | 2025-01-03 |
| 828 | QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce QuArch, a dataset of 1500 human-validated question-answer pairs designed to evaluate and enhance language models’ understanding of computer architecture. |
SHVETANK PRAKASH et. al. | arxiv-cs.AR | 2025-01-03 |
| 829 | MoColl: Agent-Based Specific and General Model Collaboration for Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches exhibit inherent limitations: specialized models excel in capturing domain-specific details but lack generalization, while vision-language models (VLMs) built on large language models (LLMs) leverage general knowledge but struggle with domain-specific adaptation. To address these limitations, this paper proposes a novel agent-enhanced model collaboration framework, which we call MoColl, designed to effectively integrate domain-specific and general knowledge. |
Pu Yang; Bin Dong; | arxiv-cs.CV | 2025-01-03 |
| 830 | Citations and Trust in LLM Generated Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explored trust through an anti-monitoring framework, where trust is predicted to be correlated with presence of citations and inversely related to checking citations. |
YIFAN DING et. al. | arxiv-cs.CL | 2025-01-02 |
| 831 | HetGCoT: Heterogeneous Graph-Enhanced Chain-of-Thought LLM Reasoning for Academic Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose HetGCoT, a framework enabling LLMs to effectively leverage and learn information from graphs to reason interpretable academic QA results. |
Runsong Jia; Mengjia Wu; Ying Ding; Jie Lu; Yi Zhang; | arxiv-cs.SI | 2025-01-02 |
| 832 | CLIP-UP: CLIP-Based Unanswerable Problem Detection for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Still, these models can make distinctly unnatural errors, for example, providing (wrong) answers to unanswerable VQA questions, such as questions asking about objects that do not appear in the image. To address this issue, we propose CLIP-UP: CLIP-based Unanswerable Problem detection, a novel lightweight method for equipping VLMs with the ability to withhold answers to unanswerable questions. |
Ben Vardi; Oron Nir; Ariel Shamir; | arxiv-cs.CV | 2025-01-02 |
| 833 | Frequency Domain Transfer Learning for Remote Sensing Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZE ZHANG et. al. | Expert Syst. Appl. | 2025-01-01 |
| 834 | Visual Question Answering in Robotic Surgery: A Comprehensive Review Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Visual Question Answering (VQA) in robotic surgery is rapidly becoming a pivotal technology in medical AI, addressing the complex challenge of interpreting multimodal surgical … |
Di Ding; Tianliang Yao; Rong Luo; Xusen Sun; | IEEE Access | 2025-01-01 |
| 835 | Disambiguation in Conversational Question Answering in The Era of LLM: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View |
MD. MEHRAB TANJIM et. al. | ArXiv | 2025-01-01 |
| 836 | Maximizing Discrimination Masking for Faithful Question Answering with Machine Reading Related Papers Related Patents Related Grants Related Venues Related Experts View |
Dong Li; Jintao Tang; Pancheng Wang; Shasha Li; Ting Wang; | Inf. Process. Manag. | 2025-01-01 |
| 837 | DeBERTA-Att-LMCQA: A Hybrid Model of DeBERTA and Attention for Legal Multi-choice Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ying Luo; Xudong Luo; Guibin Chen; | Expert Syst. Appl. | 2025-01-01 |
| 838 | BVQA: Connecting Language and Vision Through Multimodal Attention for Open-Ended Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Visual Question Answering (VQA) is a challenging problem of Artificial Intelligence (AI) that requires an understanding of natural language and computer vision to respond to … |
Md. Shalha Mucha Bhuyan; Eftekhar Hossain; Khaleda Akhter Sathi; Md. Azad Hossain; M. A. A. Dewan; | IEEE Access | 2025-01-01 |
| 839 | Alignment-Guided Self-Supervised Learning for Diagram Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Diagram question answering (DQA), which is defined as answering natural language questions according to the visual diagram context, has attracted attention and has recently become … |
SHAOWEI WANG et. al. | IEEE Transactions on Multimedia | 2025-01-01 |
| 840 | Time-aware ReAct Agent for Temporal Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Qianyi Hu; Xinhui Tu; Guo Cong; Shunping Zhang; | North American Chapter of the Association for Computational … | 2025-01-01 |
| 841 | GRI-QA: A Comprehensive Benchmark for Table Question Answering Over Environmental Data Related Papers Related Patents Related Grants Related Venues Related Experts View |
M. CONTALBO et. al. | Annual Meeting of the Association for Computational … | 2025-01-01 |
| 842 | BioASQ at CLEF2025: The Thirteenth Edition of The Large-Scale Biomedical Semantic Indexing and Question Answering Challenge Related Papers Related Patents Related Grants Related Venues Related Experts View |
A. NENTIDIS et. al. | European Conference on Information Retrieval | 2025-01-01 |
| 843 | HTML: Hierarchical Topology Multi-task Learning for Semantic Parsing in Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
A. WULAMU et. al. | Annual Meeting of the Association for Computational … | 2025-01-01 |
| 844 | Knowledge Graphs As A Source of Trust for LLM-powered Enterprise Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Juan Sequeda; D. Allemang; Bryon Jacob; | J. Web Semant. | 2025-01-01 |
| 845 | DynaQuest: A Dynamic Question Answering Dataset Reflecting Real-World Knowledge Updates Related Papers Related Patents Related Grants Related Venues Related Experts View |
Qian Lin; Junyi Li; Hwee Tou Ng; | Annual Meeting of the Association for Computational … | 2025-01-01 |
| 846 | Advancing Singlish Understanding: Bridging The Gap with Datasets and Multimodal Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Singlish, a Creole language rooted in English, is a key focus in linguistic research within multilingual and multicultural contexts. However, its spoken form remains … |
BIN WANG et. al. | arxiv-cs.CL | 2025-01-01 |
| 847 | It’s High Time: A Survey of Temporal Information Retrieval and Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Bhawna Piryani; Abdelrahman Abdallah; Jamshid Mozafari; Avishek Anand; Adam Jatowt; | ArXiv | 2025-01-01 |
| 848 | MedThink: A Rationale-Guided Framework for Explaining Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
XIAOTANG GAI et. al. | North American Chapter of the Association for Computational … | 2025-01-01 |
| 849 | NitiBench: A Comprehensive Studies of LLM Frameworks Capabilities for Thai Legal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View |
PAWITSAPAK AKARAJARADWONG et. al. | ArXiv | 2025-01-01 |
| 850 | ZPVQA: Visual Question Answering of Images Based on Zero-Shot Prompt Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, the use of zero-shot learning to solve visual question-answering (VQA) problems has become a common strategy to address the challenges of complex interactions … |
Naihao Hu; Xiaodan Zhang; Qiyuan Zhang; Wei Huo; Shaojie You; | IEEE Access | 2025-01-01 |
| 851 | LININ: Logic Integrated Neural Inference Network for Explanatory Visual Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Explanatory Visual Question Answering (EVQA) is a recently proposed multimodal reasoning task consisting of answering the visual question and generating multimodal explanations … |
Dizhan Xue; Shengsheng Qian; Quan Fang; Changsheng Xu; | IEEE Transactions on Multimedia | 2025-01-01 |
| 852 | R-VQA: A Robust Visual Question Answering Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Souvik Chowdhury; Badal Soni; | Knowl. Based Syst. | 2025-01-01 |
| 853 | KoSEL: Knowledge Subgraph Enhanced Large Language Model for Medical Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
ZEFAN ZENG et. al. | Knowl. Based Syst. | 2025-01-01 |
| 854 | MiniMedGPT: Efficient Large Vision-Language Model for Medical Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
ABDEL RAHMAN ALSABBAGH et. al. | Pattern Recognit. Lett. | 2025-01-01 |
| 855 | LLM-MedQA: Enhancing Medical Question Answering Through Case Studies in Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These limitations undermine their effectiveness in critical medical applications. To address these issues, we propose a novel approach incorporating similar case generation within a multi-agent medical question-answering (MedQA) system. |
HANG YANG et. al. | arxiv-cs.CL | 2024-12-31 |
| 856 | MapQaTor: An Extensible Framework for Efficient Annotation of Map-Based QA Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce MapQaTor, an extensible open-source framework that streamlines the creation of reproducible, traceable map-based QA datasets. |
Mahir Labib Dihan; Mohammed Eunus Ali; Md Rizwan Parvez; | arxiv-cs.CL | 2024-12-30 |
| 857 | An Empirical Evaluation of Large Language Models on Consumer Health Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the performance of several Large Language Models (LLMs) on MedRedQA, a dataset of consumer-based medical questions and answers by verified experts extracted from the AskDocs subreddit. |
Moaiz Abrar; Yusuf Sermet; Ibrahim Demir; | arxiv-cs.CL | 2024-12-30 |
| 858 | Audiopedia: Audio QA with Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Audiopedia, a novel task called Audio Question Answering with Knowledge, which requires both audio comprehension and external knowledge reasoning. |
Abhirama Subramanyam Penamakuri; Kiran Chhatre; Akshat Jain; | arxiv-cs.LG | 2024-12-29 |
| 859 | Building A Rich Dataset to Empower The Persian Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, a comprehensive open-domain dataset is presented for Persian. |
Mohsen Yazdinejad; Marjan Kaedi; | arxiv-cs.CL | 2024-12-28 |
| 860 | Pre-training, Fine-tuning and Re-ranking: A Three-Stage Framework for Legal Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a three-stage (pre-training, fine-tuning and re-ranking) framework for legal QA (called PFR-LQA), which promotes the fine-grained text representation learning and boosts the performance of dense retrieval with the dual-encoder architecture. |
Shiwen Ni; Hao Cheng; Min Yang; | arxiv-cs.CL | 2024-12-27 |
| 861 | Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate diverse temporal modeling techniques to integrate with MLLMs, aiming to achieve question-guided temporal modeling that leverages pre-trained visual and textual alignment in MLLMs. |
ROBERTO AMOROSO et. al. | arxiv-cs.CV | 2024-12-26 |
| 862 | Unlocking The Potential of Multiple BERT Models for Bangla Question Answering in NCTB Textbooks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the capability of state-of-the-art language models (RoBERTa Base, Bangla-BERT, and BERT Base) in automatically assessing Bangla passage-based question-answering from the National Curriculum and Textbook Board (NCTB) textbooks for classes 6-10. |
Abdullah Khondoker; Enam Ahmed Taufik; Md Iftekhar Islam Tashik; S M Ishtiak Mahmud; Antara Firoz Parsa; | arxiv-cs.CL | 2024-12-24 |
| 863 | HAUR: Human Annotation Understanding and Recognition Through Text-Heavy Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models exhibit significant limitations in understanding human annotations on text-heavy images. To address this, we propose the Human Annotation Understanding and Recognition (HAUR) task. |
Yuchen Yang; Haoran Yan; Yanhao Chen; Qingqiang Wu; Qingqi Hong; | arxiv-cs.CV | 2024-12-24 |
| 864 | Multi-Agents Based on Large Language Models for Knowledge-based Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have achieved impressive results in knowledge-based Visual Question Answering (VQA). |
Zhongjian Hu; Peng Yang; Bing Li; Zhenqi Wang; | arxiv-cs.CL | 2024-12-24 |
| 865 | VidCtx: Context-aware Video Question Answering with Image Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address those shortcomings, in this paper, we introduce VidCtx, a novel training-free VideoQA framework which integrates both modalities, i.e. both visual information from input frames and textual descriptions of other frames that give the appropriate context. |
Andreas Goulas; Vasileios Mezaris; Ioannis Patras; | arxiv-cs.CV | 2024-12-23 |
| 866 | Factuality or Fiction? Benchmarking Modern LLMs on Ambiguous QA with Citations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate the factual accuracy and citation performance of state-of-the-art LLMs on the task of Question Answering (QA) in ambiguous settings with source citations. |
Maya Patel; Aditi Anand; | arxiv-cs.CL | 2024-12-23 |
| 867 | Rationale-guided Prompting for Knowledge-based Visual Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework called PLRH that Prompts LLMs with Rationale Heuristics for knowledge-based VQA. |
Zhongjian Hu; Peng Yang; Bing Li; Fengyuan Liu; | arxiv-cs.CL | 2024-12-22 |
| 868 | SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, the quality of language models depends on reasoning and prompting techniques, such as COT, which remain underexplored when using speech instructions. To address these challenges, we propose SilVar, a novel end-to-end multimodal model that uses speech instructions for reasoning in visual question answering. |
Tan-Hanh Pham; Hoang-Nam Le; Phu-Vinh Nguyen; Chris Ngo; Truong-Son Hy; | arxiv-cs.CV | 2024-12-21 |
| 869 | DragonVerseQA: Open-Domain Long-Form Context-Aware Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a novel approach to develop an open-domain and long-form Over-The-Top (OTT) Question-Answering (QA) dataset, DragonVerseQA, specifically oriented to the fantasy universe of House of the Dragon and Game Of Thrones TV series. |
Aritra Kumar Lahiri; Qinmin Vivian Hu; | arxiv-cs.CL | 2024-12-21 |
| 870 | Automated CVE Analysis: Harnessing Machine Learning In Designing Question-Answering Models For Cybersecurity Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To tackle these challenges, it is necessary to develop a cybersecurity-specific dataset and train a machine learning model on it, aimed at enhancing the understanding and retrieval of domain-specific information. This paper presents a novel dataset and describes a machine learning model trained on this dataset for the QA task. |
Tanjim Bin Faruk; | arxiv-cs.CR | 2024-12-20 |
| 871 | MRAG: A Modular Retrieval Framework for Time-Sensitive Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To systematically study time-sensitive question answering, we introduce the TempRAGEval benchmark, which repurposes existing datasets by incorporating temporal perturbations and gold evidence labels. |
ZHANG SIYUE et. al. | arxiv-cs.CL | 2024-12-19 |
| 872 | Review-Then-Refine: A Dynamic Framework for Multi-Hop Question Answering with Temporal Adaptability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the challenge, this paper proposes a novel framework called review-then-refine, which aims to enhance LLM performance in multi-hop QA scenarios with temporal information. |
Xiangsen Chen; Xuming Hu; Nan Tang; | arxiv-cs.CL | 2024-12-19 |
| 873 | CodeRepoQA: A Large-scale Benchmark for Software Engineering Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce CodeRepoQA, a large-scale benchmark specifically designed for evaluating repository-level question-answering capabilities in the field of software engineering. |
RUIDA HU et. al. | arxiv-cs.SE | 2024-12-19 |
| 874 | Advancing Educational Management with The ATT-MR-WL Intelligent Question-answering Model Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Higher education plays a critical role in cultivating talent, preserving culture, and promoting social progress. However, current challenges, such as inefficient information … |
Ying Ba; | J. Web Eng. | 2024-12-19 |
| 875 | GraphEQA: Using 3D Semantic Scene Graphs for Real-time Embodied Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This problem remains challenging in robotics, due to the difficulties in obtaining useful semantic representations, updating these representations online, and leveraging prior world knowledge for efficient planning and exploration. To address these limitations, we propose GraphEQA, a novel approach that utilizes real-time 3D metric-semantic scene graphs (3DSGs) and task relevant images as multi-modal memory for grounding Vision-Language Models (VLMs) to perform EQA tasks in unseen environments. |
SAUMYA SAXENA et. al. | arxiv-cs.RO | 2024-12-18 |
| 876 | Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing debiasing methods such as Translation Chain-of-Thought or Retrieval-augmented generation (RAG) by themselves fall short of closing this performance gap, often failing to improve performance across all languages and lacking specificity for the medical domain. To address this issue, we propose CLARA (Cross-Lingual Reflective Agentic system), a novel inference-time de-biasing method leveraging retrieval augmented generation and self-verification. |
DAVID RESTREPO et. al. | arxiv-cs.CL | 2024-12-18 |
| 877 | Question: How Do Large Language Models Perform on The Question Answering Tasks? Answer: Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a comprehensive performance comparison between smaller fine-tuned models and out-of-the-box instruction-following LLMs on the Stanford Question Answering Dataset 2.0 (SQuAD2), specifically when using a single-inference prompting technique. |
Kevin Fischer; Darren Fürst; Sebastian Steindl; Jakob Lindner; Ulrich Schäfer; | arxiv-cs.CL | 2024-12-17 |
| 878 | EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce EXIT, an extractive context compression framework that enhances both the effectiveness and efficiency of retrieval-augmented generation (RAG) in question answering (QA). |
TAEHO HWANG et. al. | arxiv-cs.CL | 2024-12-17 |
| 879 | LLM-based Discriminative Reasoning for Knowledge Graph Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, LLMs often produce ungrounded subgraph planning or reasoning results in KGQA due to the hallucinatory behavior brought by the generative paradigm. To tackle this issue, we propose READS to reformulate the KGQA process into discriminative subtasks, which simplifies the search space for each subtask. |
MUFAN XU et. al. | arxiv-cs.CL | 2024-12-17 |
| 880 | Interpretable LLM-based Table Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although recent approaches using Large Language Models (LLMs) have significantly improved Table QA performance, their explanations for how the answers are generated are ambiguous. To fill this gap, we introduce Plan-of-SQLs (POS), an interpretable Table QA approach designed to improve users’ understanding of model decision-making. |
GIANG NGUYEN et. al. | arxiv-cs.CL | 2024-12-16 |
| 881 | Context Filtering with Reward Modeling in Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, the mix of relevant and irrelevant information in these contexts can hinder performance enhancements in QA tasks. To address this, we introduce a context filtering approach that removes non-essential details, summarizing crucial content through Reward Modeling. |
Sangryul Kim; James Thorne; | arxiv-cs.CL | 2024-12-16 |
| 882 | SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, current SQA datasets have limited reasoning types and neglect the relevance between tables and text, creating a significant gap with real scenarios. To address these challenges, we propose a QA benchmark for scientific tables and text with diverse reasoning types (SciTaT). |
XUANLIANG ZHANG et. al. | arxiv-cs.CL | 2024-12-16 |
| 883 | CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, because of the inherent limitation of MCQ-based evaluation and the increasing reasoning ability of MLLMs, models can give the correct answer purely by combining short video understanding with elimination, without genuinely understanding the video content. To address this gap, we introduce CG-Bench, a novel benchmark designed for clue-grounded question answering in long videos. |
GUO CHEN et. al. | arxiv-cs.CV | 2024-12-16 |
| 884 | Advancements and Challenges in Bangla Question Answering Models: A Comprehensive Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The domain of Natural Language Processing (NLP) has experienced notable progress in the evolution of Bangla Question Answering (QA) systems. This paper presents a comprehensive review of seven research articles that contribute to the progress in this domain. |
Md Iftekhar Islam Tashik; Abdullah Khondoker; Enam Ahmed Taufik; Antara Firoz Parsa; S M Ishtiak Mahmud; | arxiv-cs.CL | 2024-12-16 |
| 885 | Precise Length Control in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a method to adapt pre-trained decoder-only LLMs for precise control of response length. |
Bradley Butcher; Michael O’Keefe; James Titchener; | arxiv-cs.CL | 2024-12-16 |
| 886 | MediRAG: Secure Question Answering for Healthcare Data Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Retrieval augmented generation (RAG) allows large language models to answer domain-specific questions by using external knowledge bases without training on this domain data or … |
Emily Jiang; Alice Chen; Irene Tenison; Lalana Kagal; | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 887 | Overview of TREC 2024 Medical Video Question Answering (MedVidQA) Track Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With increasing interest in AI to support clinical decision-making and improve patient engagement, there is a need to explore such challenges and develop efficient algorithms for medical language-video understanding and generation. Toward this, we introduced new tasks to foster research toward designing systems that can understand medical videos to provide visual answers to natural language questions, and are equipped with multimodal capability to generate instruction steps from the medical video. |
Deepak Gupta; Dina Demner-Fushman; | arxiv-cs.CV | 2024-12-15 |
| 888 | Evidence Contextualization and Counterfactual Attribution for Conversational QA Over Heterogeneous Data with RAG Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, several RAG systems today suffer from two shortcomings: (i) retrieved passages usually contain their raw text and lack appropriate document context, negatively impacting both retrieval and answering quality; and (ii) attribution strategies that explain answer generation typically rely only on similarity between the answer and the retrieved passages, thereby only generating plausible but not causal explanations. In this work, we demonstrate RAGONITE, a RAG system that remedies the above concerns by: (i) contextualizing evidence with source metadata and surrounding text; and (ii) computing counterfactual attribution, a causal explanation approach where the contribution of an evidence to an answer is determined by the similarity of the original response to the answer obtained by removing that evidence. |
RISHIRAJ SAHA ROY et. al. | arxiv-cs.CL | 2024-12-13 |
| 889 | Foundation Models and Adaptive Feature Selection: A Synergistic Approach to Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Local-Global Question Aware Video Embedding (LGQAVE), which incorporates three major innovations to integrate multi-modal knowledge better and emphasize semantic visual concepts relevant to specific questions. |
SAI BHARGAV RONGALI et. al. | arxiv-cs.CV | 2024-12-12 |
| 890 | KnowShiftQA: How Robust Are RAG Systems When Textbook Knowledge Shifts in K-12 Education? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To systematically investigate RAG system robustness against such knowledge discrepancies, we introduce KnowShiftQA. |
Tianshi Zheng; Weihan Li; Jiaxin Bai; Weiqi Wang; Yangqiu Song; | arxiv-cs.CL | 2024-12-12 |
| 891 | Generating SPARQL Queries Over CIDOC-CRM Using A Two-Stage Ontology Path Patterns Method in LLM Prompts Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this article, we focus on the task of exploiting the capabilities of Large Language Models (LLMs) to generate SPARQL Queries for answering natural questions over cultural … |
M. Mountantonakis; Yannis Tzitzikas; | ACM Journal on Computing and Cultural Heritage | 2024-12-11 |
| 892 | Discrete Subgraph Sampling for Interpretable Graph Based Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we integrate different discrete subset sampling methods into a graph-based visual question answering system to compare their effectiveness in generating interpretable explanatory subgraphs intrinsically. |
Pascal Tilli; Ngoc Thang Vu; | arxiv-cs.CL | 2024-12-11 |
| 893 | Piece of Table: A Divide-and-Conquer Approach for Selecting Subtables in Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, when applying linearized tables to LMs, the maximum token lengths often imposed in self-attention calculations make it difficult to comprehensively understand the context spread across large tables. To address these challenges, we present PieTa (Piece of Table), a new framework for subtable-based question answering (QA). |
Wonjin Lee; Kyumin Kim; Sungjae Lee; Jihun Lee; Kwang In Kim; | arxiv-cs.CL | 2024-12-10 |
| 894 | Ontology-Aware RAG for Improved Question-Answering in Cybersecurity Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Integrating AI into education has the potential to transform the teaching of science and technology courses, particularly in the field of cybersecurity. AI-driven … |
CHENGSHUAI ZHAO et. al. | ArXiv | 2024-12-10 |
| 895 | RAG-based Question Answering Over Heterogeneous Data and Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article presents the QUASAR system for question answering over unstructured text, structured tables, and knowledge graphs, with unified treatment of all sources. |
Philipp Christmann; Gerhard Weikum; | arxiv-cs.CL | 2024-12-10 |
| 896 | FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite advances in visual question answering, this multihop setting remains underexplored due to a lack of quality datasets. Existing methods focus on single-hop, single-modality, or short texts, limiting real-world applications like interpreting educational documents with long, multimodal content. To fill this gap, we introduce FM2DS, the first framework for creating a high-quality dataset for MMQA. |
Amirhossein Abaskohi; Spandana Gella; Giuseppe Carenini; Issam H. Laradji; | arxiv-cs.CL | 2024-12-09 |
| 897 | PediaBench: A Comprehensive Chinese Pediatric Dataset for Benchmarking Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through an in-depth analysis of experimental results, we offer insights into the ability of LLMs to answer pediatric questions in the Chinese context, highlighting their limitations for further improvements. |
QIAN ZHANG et. al. | arxiv-cs.CL | 2024-12-09 |
| 898 | An Entailment Tree Generation Approach for Multimodal Multi-Hop Question Answering with Mixture-of-Experts and Iterative Feedback Mechanism Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: 2) A reasoning process without interpretable reasoning steps makes it difficult for the model to discover logical errors when handling complex questions. To solve these problems, we propose a unified LLMs-based approach that avoids relying heavily on them due to the LLM’s potential errors, and innovatively treats multimodal multi-hop question answering as a joint entailment tree generation and question answering problem. |
QING ZHANG et. al. | arxiv-cs.CL | 2024-12-08 |
| 899 | It Takes A Team to Triumph: Collaborative Expert Finding in Community QA Networks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The increasing complexity and multidisciplinary nature of queries on Community Question Answering (CQA) platforms have rendered the traditional model of individual expert response … |
ROOHOLLAH ETEMADI et. al. | Proceedings of the 2024 Annual International ACM SIGIR … | 2024-12-08 |
| 900 | Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph Based Question-Answering Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We believe that an effective T2I evaluation metric should accomplish the following: detect instances where the generated images do not align with the textual prompts, a discrepancy we define as the ‘hallucination problem’ in T2I tasks; record the types and frequency of hallucination issues, aiding users in understanding the causes of errors; and provide a comprehensive and intuitive score that is close to the human standard. To achieve these objectives, we propose a method based on large language models (LLMs) for conducting question-answering with an extracted scene-graph, and create a dataset with human-rated scores for generated images. |
ZIYUAN QIN et. al. | arxiv-cs.CV | 2024-12-07 |
| 901 | Knowledge Graphs Are All You Need: Leveraging KGs in Physics Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a pipeline aimed at enhancing model response quality for Question Answering tasks. |
KRISHNASAI ADDALA et. al. | arxiv-cs.CL | 2024-12-06 |
| 902 | SLA Management in Reconfigurable Multi-Agent RAG: A Systems Approach to Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a systems-oriented approach to multi-agent RAG tailored for real-world Question Answering (QA) applications. |
Michael Iannelli; Sneha Kuchipudi; Vera Dvorak; | arxiv-cs.SE | 2024-12-06 |
| 903 | SplaXBERT: Leveraging Mixed Precision Training and Context Splitting for Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: SplaXBERT, built on ALBERT-xlarge with context-splitting and mixed precision training, achieves high efficiency in question-answering tasks on lengthy texts. Tested on SQuAD v1.1, … |
Zhu Yufan; Hao Zeyu; Li Siqi; Niu Boqian; | arxiv-cs.CL | 2024-12-06 |
| 904 | GRAF: Graph Retrieval Augmented By Facts for Romanian Legal Multi-Choice Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We first introduce JuRO, the first openly available Romanian legal MCQA dataset, comprising three different examinations and a total of 10,836 questions. Along with this dataset, we introduce CROL, an organized corpus of laws with a total of 93 distinct documents and their modifications from 763 time spans, which we leveraged in this work for Information Retrieval (IR) techniques. |
Cristian-George Crăciun; Răzvan-Alexandru Smădu; Dumitru-Clementin Cercel; Mihaela-Claudia Cercel; | arxiv-cs.CL | 2024-12-05 |
| 905 | Prompt Engineering Guidance for Conceptual Agent-based Model Extraction Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This document contains detailed information about the prompts used in the experimental process discussed in the paper Toward Automating Agent-based Model Generation: A Benchmark for Model Extraction using Question-Answering Techniques. |
Siamak Khatami; Christopher Frantz; | arxiv-cs.MA | 2024-12-05 |
| 906 | Question Answering for Decisionmaking in Green Building Design: A Multimodal Data Reasoning Method Driven By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on previous research, this study innovatively integrates large language models with DGBD, creating GreenQA, a question answering framework for multimodal data reasoning. |
Yihui Li; Xiaoyue Yan; Hao Zhou; Borong Lin; | arxiv-cs.AI | 2024-12-05 |
| 907 | Give Me Some Hard Questions: Synthetic Data Generation for Clinical QA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that naive prompting often results in easy questions that do not reflect the complexity of clinical scenarios. To address this, we propose two prompting strategies: 1) instructing the model to generate questions that do not overlap with the input context, and 2) summarizing the input record using a predefined schema to scaffold question generation. |
FAN BAI et. al. | arxiv-cs.CL | 2024-12-05 |
| 908 | Domain-specific Question Answering with Hybrid Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that a hybrid approach combining a fine-tuned dense retriever with keyword based sparse search methods significantly enhances performance. |
DEWANG SULTANIA et. al. | arxiv-cs.CL | 2024-12-04 |
| 909 | Chinese Elderly Healthcare-Oriented Conversation: CareQA Dataset and Its Knowledge Distillation Based Generation Framework Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The increasing global aging brings the substantial demand for healthcare knowledge among the elderly. Large Language Models (LLMs) based Conversation Agents (CAs) hold significant … |
He Xiao; Xingjiao Wu; Jialiang Tong; Bangyan Li; Yuling Sun; | 2024 IEEE International Conference on Bioinformatics and … | 2024-12-03 |
| 910 | A Novel RAG Framework with Knowledge-Enhancement for Biomedical Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The biomedical question-answering system usually provide accurate and real-time responses, which is crucial for clinical decision-making and scientific research. Although large … |
Yongping Du; Zikai Wang; Binrui Wang; Xingnan Jin; Yu Pei; | 2024 IEEE International Conference on Bioinformatics and … | 2024-12-03 |
| 911 | A Novel Knowledge Enhanced Large Language Model Augmented Framework for Medical Question Answering Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Leveraging domain-specific knowledge from pre-trained large language models and knowledge graphs for reasoning in the medical question answering task has emerged as a prominent … |
Haochen Zou; Yongli Wang; | 2024 IEEE International Conference on Bioinformatics and … | 2024-12-03 |
| 912 | Copy-Move Forgery Detection and Question Answering for Remote Sensing Image Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Driven by practical demands in land resource monitoring and national defense security, this paper introduces the Remote Sensing Copy-Move Question Answering (RSCMQA) task. |
ZE ZHANG et. al. | arxiv-cs.CV | 2024-12-03 |
| 913 | QA-TOOLBOX: Conversational Question-Answering for Process Task Guidance in Manufacturing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore utilizing LLMs for data augmentation for a manufacturing task guidance system. |
RAMESH MANUVINAKURIKE et. al. | arxiv-cs.CL | 2024-12-03 |
| 914 | An Evolutionary Large Language Model for Hallucination Mitigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose EvoLLMs, an innovative framework inspired by Evolutionary Computation, which automates the generation of high-quality Question-answering (QA) datasets while minimizing hallucinations. |
Abdennour Boulesnane; Abdelhakim Souilah; | arxiv-cs.CL | 2024-12-03 |
| 915 | Hybrid-SQuAD: Hybrid Scholarly Question Answering Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, scholarly information often spans heterogeneous sources, necessitating the development of QA systems that integrate information from multiple heterogeneous data sources. To address this challenge, we introduce Hybrid-SQuAD (Hybrid Scholarly Question Answering Dataset), a novel large-scale QA dataset designed to facilitate answering questions incorporating both text and KG facts. |
Tilahun Abedissa Taffa; Debayan Banerjee; Yaregal Assabie; Ricardo Usbeck; | arxiv-cs.CL | 2024-12-03 |
| 916 | GraphOTTER: Evolving LLM-based Graph Reasoning for Complex Table Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose GraphOTTER that explicitly establishes the reasoning process to pinpoint the correct answers. |
QIANLONG LI et. al. | arxiv-cs.CL | 2024-12-02 |
| 917 | Eyes on The Road: State-of-the-Art Video Question Answering Models Assessment for Traffic Monitoring Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The framework leverages GPT-4o to assess accuracy, relevance, and consistency across basic detection, temporal reasoning, and decomposition queries. |
Joseph Raj Vishal; Divesh Basina; Aarya Choudhary; Bharatesh Chakravarthi; | arxiv-cs.CV | 2024-12-02 |
| 918 | A Lightweight Transformer-based Visual Question Answering Network with Weight-Sharing Hybrid Attention Related Papers Related Patents Related Grants Related Venues Related Experts View |
Yue Zhu; Dongyue Chen; Tong Jia; Shizhuo Deng; | Neurocomputing | 2024-12-01 |
| 919 | GS-CBR-KBQA: Graph-structured Case-based Reasoning for Knowledge Base Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jiecheng Li; Xudong Luo; Guangquan Lu; | Expert Syst. Appl. | 2024-12-01 |
| 920 | Different Paths to The Same Destination: Diversifying LLMs Generation for Multi-hop Open-domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Ronghan Li; Yu Wang; Zijian Wen; Mingze Cui; Qiguang Miao; | Knowl. Based Syst. | 2024-12-01 |
| 921 | Generative Language Models Potential for Requirement Engineering Applications: Insights Into Current Strengths and Limitations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional language models have been extensively evaluated for the software engineering domain; however, the potential of ChatGPT and Gemini has not been fully explored. To fill this gap, this paper presents a comprehensive case study to investigate the potential of both language models for the development of diverse types of requirement engineering applications. |
Summra Saleem; Muhammad Nabeel Asim; Ludger Van Elst; Andreas Dengel; | arxiv-cs.SE | 2024-12-01 |
| 922 | Automated Construction Safety Reporting System Integrating Deep Learning-based Real-time Advanced Detection and Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Shihao Wen; Minsoo Park; D. Tran; Seungsoo Lee; Seunghee Park; | Adv. Eng. Softw. | 2024-12-01 |
| 923 | HintMiner: Automatic Question Hints Mining From Q&A Web Posts with Language Model Via Self-Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Question-Answering (QA) forums such as Stack Overflow cannot always respond to questions in a timely and proper manner. In this paper, we propose HintMiner, a novel automatic question hints mining tool that helps users find answers. |
Zhenyu Zhang; JiuDong Yang; | aistats | 2024-12-01 |
| 924 | Overcoming Language Priors in Visual Question Answering with Cumulative Learning Strategy IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Aihua Mao; Feng Chen; Ziying Ma; Ken Lin; | Neurocomputing | 2024-12-01 |
| 925 | DynRank: Improving Passage Retrieval with Dynamic Zero-Shot Prompting Based on Question Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents DynRank, a novel framework for enhancing passage retrieval in open-domain question-answering systems through dynamic zero-shot question classification. |
Abdelrahman Abdallah; Jamshid Mozafari; Bhawna Piryani; Mohammed M. Abdelgwad; Adam Jatowt; | arxiv-cs.CL | 2024-11-30 |
| 926 | Perception Test 2024: Challenge Summary and A Novel Hour-Long VideoQA Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We summarise in this report the challenge tasks and results, and introduce in detail the novel hour-long video QA benchmark 1h-walk VQA. |
Joseph Heyward; João Carreira; Dima Damen; Andrew Zisserman; Viorica Pătrăucean; | arxiv-cs.CV | 2024-11-29 |
| 927 | TQA-Bench: Evaluating LLMs for Multi-Table Question Answering with Scalable Context and Symbolic Extension Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing benchmarks primarily focus on single-table QA, failing to capture the intricacies of reasoning across multiple relational tables, as required in real-world domains such as finance, healthcare, and e-commerce. To address this gap, we present TQA-Bench, a new multi-table QA benchmark designed to evaluate the capabilities of LLMs in tackling complex QA tasks over relational data. |
Zipeng Qiu; You Peng; Guangxin He; Binhang Yuan; Chen Wang; | arxiv-cs.AI | 2024-11-29 |
| 928 | Actions and Objects Pathways for Domain Adaptation in Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the Actions and Objects Pathways (AOPath) for out-of-domain generalization in video question answering tasks. |
Safaa Abdullahi Moallim Mohamud; Ho-Young Jung; | arxiv-cs.CV | 2024-11-28 |
| 929 | Understanding The Design Decisions of Retrieval-Augmented Generation Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis provides evidence-based guidance for practitioners and establishes foundational insights for principled RAG deployment. |
SHENGMING ZHAO et. al. | arxiv-cs.SE | 2024-11-28 |
| 930 | Overview of TREC 2024 Biomedical Generative Retrieval (BioGen) Track Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods for grounding generated statements in reliable sources along with practical evaluation approaches are needed to overcome this barrier. Towards this, in our pilot task organized at TREC 2024, we introduced the task of reference attribution as a means to mitigate the generation of false statements by LLMs answering biomedical questions. |
Deepak Gupta; Dina Demner-Fushman; William Hersh; Steven Bedrick; Kirk Roberts; | arxiv-cs.IR | 2024-11-27 |
| 931 | Natural Language Understanding and Inference with MLLM in Visual Question Answering: A Survey IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of our survey is to provide an overview of the development of VQA and a detailed description of the latest models with high timeliness. |
JIAYI KUANG et. al. | arxiv-cs.CL | 2024-11-26 |
| 932 | Task Progressive Curriculum Learning for Robust Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show for the first time that robust Visual Question Answering is attainable by simply enhancing the training strategy. |
AHMED AKL et. al. | arxiv-cs.CV | 2024-11-26 |
| 933 | Text-Guided Coarse-to-Fine Fusion Network for Robust Remote Sensing Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a Text-guided Coarse-to-Fine Fusion Network (TGFNet), which leverages the semantic relationships between question text and multi-source images to guide the network toward complementary fusion at the feature level. |
ZHICHENG ZHAO et. al. | arxiv-cs.CV | 2024-11-24 |
| 934 | AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce AfriMed-QA, the first large-scale Pan-African English multi-specialty medical Question-Answering (QA) dataset, with 15,000 questions (open and closed-ended) sourced from over 60 medical schools across 16 countries, covering 32 medical specialties. |
TOBI OLATUNJI et. al. | arxiv-cs.CL | 2024-11-23 |
| 935 | Adversarial Sample Synthesis for Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Chuanhao Li; Chenchen Jing; Zhen Li; Yuwei Wu; Yunde Jia; | ACM Transactions on Multimedia Computing, Communications … | 2024-11-21 |
| 936 | KTMN: Knowledge-driven Two-stage Modulation Network for Visual Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View |
Jingya Shi; Dezhi Han; Chongqing Chen; Xiang Shen; | Multim. Syst. | 2024-11-20 |
| 937 | Retrieval-Augmented Generation for Domain-Specific Question Answering: A Case Study on Pittsburgh and CMU Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We designed a Retrieval-Augmented Generation (RAG) system to provide large language models with relevant documents for answering domain-specific questions about Pittsburgh and Carnegie Mellon University (CMU). |
Haojia Sun; Yaqi Wang; Shuting Zhang; | arxiv-cs.LG | 2024-11-20 |
| 938 | Do LLMs Understand Ambiguity in Text? A Case Study in Open-world Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate how simple, training-free, token-level disambiguation methods may be effectively used to improve LLM performance for ambiguous question answering tasks. |
Aryan Keluskar; Amrita Bhattacharjee; Huan Liu; | arxiv-cs.CL | 2024-11-19 |
| 939 | Exploring and Exploiting Model Uncertainty for Robust Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
XUESONG ZHANG et. al. | Multim. Syst. | 2024-11-19 |
| 940 | Mitigating Knowledge Conflicts in Language Model-Driven Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We specifically target a common hallucination pattern in question answering, examining how the correspondence between entities and their contexts during model training influences the system’s performance at inference time. |
HAN CAO et. al. | arxiv-cs.CL | 2024-11-18 |
| 941 | Memory-Augmented Multimodal LLMs for Surgical VQA Via Self-Contained Inquiry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods often struggle with limited scene understanding and question comprehension, and some rely on external resources (e.g., pre-extracted object features), which can introduce errors and generalize poorly across diverse surgical environments. To address these challenges, we propose SCAN, a simple yet effective memory-augmented framework that leverages Multimodal LLMs to improve surgical context comprehension via Self-Contained Inquiry. |
WENJUN HOU et. al. | arxiv-cs.CV | 2024-11-16 |
| 942 | Understanding Multimodal LLMs: The Mechanistic Interpretability of Llava in Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we apply mechanistic interpretability methods to analyze the visual question answering (VQA) mechanisms in the first MLLM, Llava. |
Zeping Yu; Sophia Ananiadou; | arxiv-cs.CL | 2024-11-16 |
| 943 | An Evidence-based Approach for Open-domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View |
Parastoo Jafarzadeh; F. Ensan; | Knowl. Inf. Syst. | 2024-11-15 |
| 944 | A Benchmark for Long-Form Medical Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a new publicly available benchmark featuring real-world consumer medical questions with long-form answer evaluations annotated by medical doctors. |
PEDRAM HOSSEINI et. al. | arxiv-cs.CL | 2024-11-14 |
| 945 | Improving Ranking-based Question Answering with Weak Supervision for Low-resource Qur’anic Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This work tackles the challenge of ranking-based machine reading comprehension (MRC), where a question answering (QA) system generates a ranked list of relevant answers for each … |
Mohammed ElKomy; Amany Sarhan; | Artif. Intell. Rev. | 2024-11-14 |
| 946 | Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper addresses this gap by providing a comprehensive evaluation framework for medical question-answering (QA) systems in a RAG setting for these situations, including sufficiency, integration, and robustness. We introduce Medical Retrieval-Augmented Generation Benchmark (MedRGB) that provides various supplementary elements to four medical QA datasets for testing LLMs’ ability to handle these specific scenarios. |
Nghia Trung Ngo; Chien Van Nguyen; Franck Dernoncourt; Thien Huu Nguyen; | arxiv-cs.CL | 2024-11-14 |
| 947 | Exploring State-of-the-Art LLMs from BERT to XLNet: A Study Over Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In recent years, both academia and industry have witnessed significant advancements in Large Language Models (LLMs) research, with models like ChatGPT garnering extensive … |
L. M. P. Navarro; E. Batista; Marco Pacheco; | 2024 IEEE Latin American Conference on Computational … | 2024-11-13 |
| 948 | The Limited Impact of Medical Adaptation of Large Language and Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we compare ten medical LLMs and two VLMs against their corresponding base models, arriving at a different conclusion: all medical VLMs and nearly all medical LLMs fail to consistently improve over their base models in the zero-/few-shot prompting and supervised fine-tuning regimes for medical question answering (QA). |
Daniel P. Jeong; Pranav Mani; Saurabh Garg; Zachary C. Lipton; Michael Oberst; | arxiv-cs.CL | 2024-11-13 |
| 949 | Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the _Extract-Refine-Retrieve-Read_ (ERRR) framework, a novel approach designed to bridge the pre-retrieval information gap in Retrieval-Augmented Generation (RAG) systems through query optimization tailored to meet the specific knowledge requirements of Large Language Models (LLMs). |
Youan Cong; Pritom Saha Akash; Cheng Wang; Kevin Chen-Chuan Chang; | arxiv-cs.CL | 2024-11-12 |
| 950 | Deceiving Question-Answering Models: A Hybrid Word-Level Adversarial Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces QA-Attack (Question Answering Attack), a novel word-level adversarial strategy that fools QA models. |
Jiyao Li; Mingze Ni; Yongshun Gong; Wei Liu; | arxiv-cs.CL | 2024-11-12 |
| 951 | Revisiting Automated Evaluation for Long-form Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LFTQA-Eval, a meta-evaluation dataset comprising 2,988 human-annotated examples, to rigorously assess the efficacy of current automated metrics in assessing LLM-based LFTQA systems, with a focus on faithfulness and comprehensiveness. |
Yuqi Wang; Lyuhao Chen; Songcheng Cai; Zhijian Xu; Yilun Zhao; | emnlp | 2024-11-11 |
| 952 | SparrowVQE: Visual Question Explanation for Course Content Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to advance the field by introducing Visual Question Explanation (VQE), which enhances the ability of VQA to provide detailed explanations rather than brief responses and address the need for more complex interaction with visual content. |
Jialu Li; Manish Kumar Thota; Ruslan Gokhman; Radek Holik; Youshan Zhang; | arxiv-cs.CV | 2024-11-11 |
| 953 | CommVQA: Situating Visual Question Answering in Communicative Contexts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate how situating images within naturalistic contexts shapes visual questions, we introduce CommVQA, a VQA dataset consisting of images, image descriptions, real-world communicative scenarios where the image might appear (e.g., a travel website), and follow-up questions and answers conditioned on the scenario and description. |
Nandita Shankar Naik; Christopher Potts; Elisa Kreiss; | emnlp | 2024-11-11 |
| 954 | Large Language Models Are Poor Clinical Decision-Makers: A Comprehensive Benchmark IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To better understand LLMs in the clinic, we construct a benchmark ClinicBench. |
FENGLIN LIU et. al. | emnlp | 2024-11-11 |
| 955 | Training-free Deep Concept Injection Enables Language Models for Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we make the first attempt to demonstrate that the PLM is able to perform zero-shot crossmodal tasks without any crossmodal pretraining, when the observed visual concepts are injected as both additional input text tokens and augmentation in the intermediate features within each feed-forward network for the PLM. |
Xudong Lin; Manling Li; Richard Zemel; Heng Ji; Shih-Fu Chang; | emnlp | 2024-11-11 |
| 956 | Pre-training Cross-lingual Open Domain Question Answering with Large-scale Synthetic Supervision Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View |
Fan Jiang; Tom Drummond; Trevor Cohn; | emnlp | 2024-11-11 |
| 957 | MILD Bot: Multidisciplinary Childhood Cancer Survivor Question-Answering Bot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a Multidisciplinary chILDhood cancer survivor question-answering (MILD) bot designed to support childhood cancer survivors facing diverse challenges in their survivorship journey. |
MIRAE KIM et. al. | emnlp | 2024-11-11 |
| 958 | Toward Optimal Search and Retrieval for RAG Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we work towards the goal of understanding how retrievers can be optimized for RAG pipelines for common tasks such as Question Answering (QA). |
ALEXANDRIA LETO et. al. | arxiv-cs.CL | 2024-11-11 |
| 959 | You Make Me Feel Like A Natural Question: Training QA Systems on Transformed Trivia Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Training question-answering (QA) and information retrieval systems for web queries requires large, expensive datasets that are difficult to annotate and time-consuming to gather. … |
TASNIM KABIR et. al. | emnlp | 2024-11-11 |
| 960 | EfficientRAG: Efficient Retriever for Multi-Hop Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce EfficientRAG, an efficient retriever for multi-hop question answering. |
ZIYUAN ZHUANG et. al. | emnlp | 2024-11-11 |
| 961 | Encoding and Controlling Global Semantics for Long-form Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To further enhance the controllability, we introduce a cross-modal compositional congruence objective to encourage global semantics aligned with the question. |
THONG THANH NGUYEN et. al. | emnlp | 2024-11-11 |
| 962 | CompAct: Compressing Retrieved Documents Actively for Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Context compression tackles this issue by filtering out irrelevant information, but current methods still struggle in realistic scenarios where crucial information cannot be captured with a single-step approach. To overcome this limitation, we introduce CompAct, a novel framework that employs an active strategy to condense extensive documents without losing key information. |
Chanwoong Yoon; Taewhoo Lee; Hyeon Hwang; Minbyul Jeong; Jaewoo Kang; | emnlp | 2024-11-11 |
| 963 | SciDQA: A Deep Reading Comprehension Dataset Over Scientific Papers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SciDQA, a new dataset for reading comprehension that challenges language models to deeply understand scientific articles, consisting of 2,937 QA pairs. |
Shruti Singh; Nandan Sarkar; Arman Cohan; | emnlp | 2024-11-11 |
| 964 | ERVQA: A Dataset to Benchmark The Readiness of Large Vision Language Models in Hospital Environments Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the Emergency Room Visual Question Answering (ERVQA) dataset, consisting of |
SOURJYADIP RAY et. al. | emnlp | 2024-11-11 |
| 965 | Self-Bootstrapped Visual-Language Model for Knowledge Selection and Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, the retrieved knowledge is not truly conducive to helping answer the question, affecting the performance of the overall system. To address this issue, we propose a novel framework that leverages the visual-language model to select the key knowledge retrieved by DPR and answer questions. |
Dongze Hao; Qunbo Wang; Longteng Guo; Jie Jiang; Jing Liu; | emnlp | 2024-11-11 |
| 966 | Efficient Answer Retrieval System (EARS): Combining Local DB Search and Web Search for Generative QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an efficient answer retrieval system **EARS**: a production-ready, factual question answering (QA) system that combines local knowledge base search with generative, context-based QA. |
Nikita Krayko; Ivan Sidorov; Fedor Laputin; Daria Galimzianova; Vasily Konovalov; | emnlp | 2024-11-11 |
| 967 | Unlocking Markets: A Multilingual Benchmark to Cross-Market Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a large-scale dataset comprising over 7 million questions from 17 marketplaces across 11 languages. |
Yifei Yuan; Yang Deng; Anders Søgaard; Mohammad Aliannejadi; | emnlp | 2024-11-11 |
| 968 | OMG-QA: Building Open-Domain Multi-Modal Generative Question Answering Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce OMG-QA, a new resource for question answering that is designed to evaluate the effectiveness of question answering systems that perform retrieval augmented generation (RAG) in scenarios that demand reasoning on multi-modal, multi-document contexts. |
LINYONG NAN et. al. | emnlp | 2024-11-11 |
| 969 | Empowering Large Language Model for Continual Video Question Answering with Collaborative Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the novel challenge of VideoQA within a continual learning framework, and empirically identify a critical issue: fine-tuning a large language model (LLM) for a sequence of tasks often results in catastrophic forgetting. |
CHEN CAI et. al. | emnlp | 2024-11-11 |
| 970 | Adaptive Question Answering: Enhancing Language Model Proficiency for Addressing Knowledge Conflicts with Source Citations IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the importance of both aspects, no prior research has combined them, leaving a significant gap in the development of QA systems. In this work, we bridge this gap by proposing the novel task of QA with source citation in ambiguous settings, where multiple valid answers exist. |
Sagi Shaier; Ari Kobren; Philip V. Ogren; | emnlp | 2024-11-11 |
| 971 | Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing benchmarks employ irrelevant noise texts to artificially extend the length of test cases, diverging from the real-world scenarios of long-context applications. To bridge this gap, we propose a novel long-context benchmark, Loong, aligning with realistic scenarios through extended multi-document question answering (QA). |
MINZHENG WANG et. al. | emnlp | 2024-11-11 |
| 972 | Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present MIRAGE – Model Internals-based RAG Explanations – a plug-and-play approach using model internals for faithful answer attribution in RAG applications. |
Jirui Qi; Gabriele Sarti; Raquel Fernández; Arianna Bisazza; | emnlp | 2024-11-11 |
| 973 | StorySparkQA: Expert-Annotated QA Pairs with Real-World Knowledge for Children’s Story-Based Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This limitation can be attributed to the existing question-answering (QA) datasets used for children’s education, upon which the systems are built, failing to capture the nuances of how education experts think when conducting interactive story reading activities. To bridge this gap, we design an annotation framework, empowered by an existing knowledge graph, to capture experts’ annotations and thinking process, and leverage this framework to construct the StorySparkQA dataset, which comprises 5,868 expert-annotated QA pairs with real-world knowledge. |
JIAJU CHEN et. al. | emnlp | 2024-11-11 |
| 974 | Multi-Level Information Retrieval Augmented Generation for Knowledge-based Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a multi-level information RAG approach that enhances answer generation through entity retrieval and query expansion. |
Adjali Omar; Olivier Ferret; Sahar Ghannay; Hervé Le Borgne; | emnlp | 2024-11-11 |
| 975 | A Simple LLM Framework for Long-Range Video Question-Answering IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LLoVi, a simple yet effective **L**anguage-based **Lo**ng-range **Vi**deo question-answering (LVQA) framework. |
CE ZHANG et. al. | emnlp | 2024-11-11 |
| 976 | REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the extensive efforts on RAG research, in existing methods, LLMs cannot precisely assess the relevance of retrieved documents, thus likely leading to misleading or even incorrect utilization of external knowledge (i.e., retrieved documents). To address this issue, in this paper, we propose REAR, a RElevance-Aware Retrieval-augmented approach for open-domain question answering (QA). |
YUHAO WANG et. al. | emnlp | 2024-11-11 |
| 977 | RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, most existing datasets for this task are either constructed using a single source corpus or consist of short extractive answers, which fall short of evaluating large language model (LLM) based RAG-QA systems on cross-domain generalization. To address these limitations, we create Long-form RobustQA (LFRQA), a new dataset comprising human-written long-form answers that integrate short extractive answers from multiple documents into a single, coherent narrative, covering 26K queries and large corpora across seven different domains. |
RUJUN HAN et. al. | emnlp | 2024-11-11 |
| 978 | Right for Right Reasons: Large Language Models for Verifiable Commonsense Knowledge Graph Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we propose Right for Right Reasons (R3), a commonsense KGQA methodology that allows for a verifiable reasoning procedure by axiomatically surfacing intrinsic commonsense knowledge of LLMs and grounding every factual reasoning step on KG triples. |
Armin Toroghi; Willis Guo; Mohammad Mahdi Abdollah Pour; Scott Sanner; | emnlp | 2024-11-11 |
| 979 | Generate-on-Graph: Treat LLM As Both Agent and KG for Incomplete Knowledge Graph Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To handle IKGQA, we propose a training-free method called Generate-on-Graph (GoG), which can generate new factual triples while exploring KGs. |
YAO XU et. al. | emnlp | 2024-11-11 |
| 980 | PCQPR: Proactive Conversational Question Planning with Reflection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we redefine the CQG task as Conclusion-driven Conversational Question Generation (CCQG) by focusing on proactivity, not merely reacting to the unfolding conversation but actively steering it towards a conclusion-oriented question-answer pair. To address this, we propose a novel approach, called Proactive Conversational Question Planning with self-Refining (PCQPR). |
Shasha Guo; Lizi Liao; Jing Zhang; Cuiping Li; Hong Chen; | emnlp | 2024-11-11 |
| 981 | Do Great Minds Think Alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent advancements of large language models (LLMs) have led to claims of AI surpassing humans in natural language processing (NLP) tasks such as textual understanding and reasoning. This work investigates these assertions by introducing CAIMIRA, a novel framework rooted in item response theory (IRT) that enables quantitative assessment and comparison of problem-solving abilities in question-answering (QA) agents. |
Maharshi Gor; Hal Daumé III; Tianyi Zhou; Jordan Lee Boyd-Graber; | emnlp | 2024-11-11 |
| 982 | Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing methods, like concatenation or free-form textual conversion of triples, have limitations, including duplicated entities or relations, reduced evidence density, and failure to highlight crucial evidence. To address these issues, we propose EFSum, an Evidence-focused Fact Summarization framework for enhanced QA with knowledge-augmented LLMs. |
Sungho Ko; Hyunjin Cho; Hyungjoo Chae; Jinyoung Yeo; Dongha Lee; | emnlp | 2024-11-11 |
| 983 | Towards Faithful Knowledge Graph Explanation Through Deep Alignment in Commonsense Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify confounding effects and LM-KG misalignment as key factors causing spurious explanations. To address this, we introduce the LM-KG Fidelity metric to assess KG representation reliability and propose the LM-KG Distribution-aware Alignment (LKDA) algorithm to improve explanation faithfulness. |
Weihe Zhai; Arkaitz Zubiaga; Bingquan Liu; Chengjie Sun; Yalong Zhao; | emnlp | 2024-11-11 |
| 984 | Triad: A Framework Leveraging A Multi-Role LLM-based Agent to Solve Knowledge Base Question Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present Triad, a unified framework that utilizes an LLM-based agent with multiple roles for KBQA tasks. |
CHANG ZONG et. al. | emnlp | 2024-11-11 |
| 985 | EVQAScore: A Fine-grained Metric for Video Question Answering Data Quality Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although various methods have been proposed for assessing video caption quality, there remains a lack of dedicated evaluation methods for Video QA. To address this gap, we introduce EVQAScore, a reference-free method that leverages keyword extraction to assess both video caption and video QA data quality. |
Hao Liang; Zirong Chen; Hejun Dong; Wentao Zhang; | arxiv-cs.CV | 2024-11-11 |
| 986 | RAC: Retrieval-augmented Conversation Dataset for Open-domain Question Answering in Conversational Settings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel retrieval-augmented conversation (RAC) dataset and develop a baseline system comprising query rewriting, retrieval, reranking, and response generation stages. |
Bonggeun Choi; JeongJae Park; Yoonsung Kim; Jaehyun Park; Youngjoong Ko; | emnlp | 2024-11-11 |
| 987 | Can LLM Generate Culturally Relevant Commonsense QA Data? Case Study in Indonesian and Sundanese IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate the effectiveness of using LLMs in generating culturally relevant commonsense QA datasets for Indonesian and Sundanese languages. |
Rifki Afina Putri; Faiz Ghifari Haznitrama; Dea Adhista; Alice Oh; | emnlp | 2024-11-11 |
| 988 | Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we compare seven public medical LLMs and two VLMs against their corresponding base models, arriving at a different conclusion: all medical VLMs and nearly all medical LLMs fail to consistently improve over their base models in the zero-/few-shot prompting regime for medical question-answering (QA) tasks. |
Daniel P Jeong; Saurabh Garg; Zachary Chase Lipton; Michael Oberst; | emnlp | 2024-11-11 |
| 989 | LONGAGENT: Achieving Question Answering for 128k-Token-Long Documents Through Multi-Agent Collaboration IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce _LongAgent_, a multi-agent collaboration method that enables efficient and effective QA over 128k-token-long documents. |
JUN ZHAO et. al. | emnlp | 2024-11-11 |
| 990 | Cross-lingual Transfer for Automatic Question Generation By Learning Interrogative Structures in Target Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a simple and efficient XLT-QG method that operates without the need for monolingual, parallel, or labeled data in the target language, utilizing a small language model. |
Seonjeong Hwang; Yunsu Kim; Gary Lee; | emnlp | 2024-11-11 |
| 991 | RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a weakly supervised method for training the RE simply utilizing question-answer data without any labels for correct contexts. |
Kiseung Kim; Jay-Yoon Lee; | emnlp | 2024-11-11 |
| 992 | ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these methods require additional training, hand-crafted templates or human-written explanations. To address these issues, we introduce ZEBRA, a zero-shot question answering framework that combines retrieval, case-based reasoning and introspection and dispenses with the need for additional training of the LLM. |
Francesco Maria Molfese; Simone Conia; Riccardo Orlando; Roberto Navigli; | emnlp | 2024-11-11 |
| 993 | PDFTriage: Question Answering Over Long, Structured Documents IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: When a system has to query the document for context, this incongruity is brought to the fore, and seemingly trivial questions can trip up the QA system. To bridge this fundamental gap in handling structured documents, we propose an approach called PDFTriage that enables models to retrieve the context based on either structure or content. |
JON SAAD-FALCON et. al. | emnlp | 2024-11-11 |
| 994 | GOVERN: Gradient Orientation Vote Ensemble for Multi-Teacher Reinforced Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, for practical deployment, it is crucial to perform knowledge distillation to maintain high performance while operating under computational constraints. In this paper, we address a key question: given the importance of unsupervised distillation for student model performance, how can knowledge from multiple teacher models be effectively ensembled during this stage without the guidance of labels? |
WENJIE ZHOU et. al. | emnlp | 2024-11-11 |
| 995 | TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Currently, existing methods perform all of these steps in a single pass without being able to adapt if insufficient or incorrect information is collected. To overcome this, we introduce a modular multi-LMM agent framework based on several agents with different roles, instructed by a Planner agent that updates its instructions using shared feedback from the other agents. |
Chuyi Shang; Amos You; Sanjay Subramanian; Trevor Darrell; Roei Herzig; | emnlp | 2024-11-11 |
| 996 | DVD: Dynamic Contrastive Decoding for Knowledge Amplification in Multi-Document Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Retrieval-augmented generation (RAG) offers a potential remedy, yet the uneven retrieval quality and irrelevant contents may distract LLMs. In this work, we address these issues at the generation phase by treating RAG as a multi-document QA task. |
Jing Jin; Houfeng Wang; Hao Zhang; Xiaoguang Li; Zhijiang Guo; | emnlp | 2024-11-11 |
| 997 | Towards Efficient Dataset Development: A Case Study of M2Q2+ in Movie QA Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The development and availability of high-quality datasets are critical to the success of machine learning and artificial intelligence (AI) systems. Datasets provide the essential … |
A. Aggoune; Zakaria Mihoubi; | 2024 International Conference on Advanced Aspects of … | 2024-11-09 |
| 998 | GUIDEQ: Framework for Guided Questioning for Progressive Informational Collection and Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our work, GUIDEQ, presents a novel framework for asking guided questions to further progress partial information. |
Priya Mishra; Suraj Racha; Kaustubh Ponkshe; Adit Akarsh; Ganesh Ramakrishnan; | arxiv-cs.CL | 2024-11-08 |
| 999 | SaSR-Net: Source-Aware Semantic Representation Network for Enhancing Audio-Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the Source-aware Semantic Representation Network (SaSR-Net), a novel model designed for AVQA. |
TIANYU YANG et. al. | arxiv-cs.CV | 2024-11-07 |
| 1000 | MEG: Medical Knowledge-Augmented Large Language Models for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present MEG, a parameter-efficient approach for medical knowledge-augmented LLMs. |
Laura Cabello; Carmen Martin-Turrero; Uchenna Akujuobi; Anders Søgaard; Carlos Bobed; | arxiv-cs.CL | 2024-11-06 |