Paper Digest: Recent Papers on Semantic Segmentation
The Paper Digest Team extracted all recent papers related to semantic segmentation on our radar and generated highlight sentences for them. The results are sorted by relevance and date. In addition to this 'static' page, we also provide a real-time version of this article, which has broader coverage and is continuously updated with the most recent work on this topic.
Since 2018, Paper Digest has built a foundation of data spanning decades of conferences, journals, and research topics. The platform features a daily digest service that sifts through tens of thousands of new papers, clinical trials, news articles, and community posts, filtering the noise to highlight what matters most to specific interests. Beyond daily updates, dozens of built-in research tools streamline the academic workflow, supporting efficient reading and writing, comprehensive literature reviews, and automated research report generation.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: Recent Papers on Semantic Segmentation
| | Paper | Author(s) | Source | Date |
|---|---|---|---|---|
| 1 | DyMamba: Dynamic Mamba for Microscopy Image Semantic Segmentation. Highlight: In this article, we propose DyMamba, a Mamba-based model featuring a dynamic scanning strategy that adaptively plans scanning paths based on local features and complexity. | Buqing Cai et al. | Bioinformatics | 2026-06-16 |
| 2 | Label Tree Semantic Losses for Rich Multi-class Medical Image Segmentation. Highlight: In this work, we propose two tree-based semantic loss functions which take advantage of a hierarchical organization of the labels. | Junwen Wang et al. | Frontiers in Artificial Intelligence | 2026-06-16 |
| 3 | ActiveSAM: Image-Conditional Class Pruning for Fast and Accurate Open-Vocabulary Segmentation. Highlight: We introduce ActiveSAM, a training-free, zero-shot inference framework that turns SAM 3 into an active-vocabulary segmenter. | Tran Dinh Tien; Zhiqiang Shen | arxiv-cs.CV | 2026-06-15 |
| 4 | Attention-Based Prototype Calibration for Multi-Rater Few-Shot Medical Image Segmentation. Highlight: We propose an attention-based prototype calibration framework for few-shot multi-rater segmentation that models rater-specific deviations from a consensus representation in prototype space. | Truong Vu; Minh Khoi Ho; Yutong Xie | arxiv-cs.CV | 2026-06-15 |
| 5 | Texture-Shape Bias Balancing for Robust Synthetic-to-Real Semantic Segmentation in Automotive NIR Imagery. Highlight: We propose a generative augmentation framework that transforms synthetic images into realistic NIR-style variants via our introduced target style adaptation (TSA). | Felix Stillger et al. | arxiv-cs.CV | 2026-06-12 |
| 6 | Visualizing Image Segmentation Network Behavior Through The Lens of Scale Space Analysis. Highlight: We introduce a visual analytics framework that supports a systematic qualitative and quantitative investigation of how segmentations evolve across scale space. | A. C. Mikliss; T. Schultz | Computer Graphics Forum | 2026-06-11 |
| 7 | LASA: A Weak Supervision Method for Open-Vocabulary Scene Sketch Semantic Segmentation. Highlight: This suggests that cross-layer aggregation provides a more robust structural prior than any individual layer alone. Leveraging this insight, we propose a structure-aware framework built upon Layer-wise Accumulated Structural Attention (LASA), which aggregates multi-layer attention to guide hierarchical semantic alignment under weak supervision and refine predictions during inference. | Liwen Yi; Xianlin Zhang; Yue Zhang; Yue Ming; Xueming Li | arxiv-cs.CV | 2026-06-10 |
| 8 | PT-WNO: Point Transformer with Wavelet Neural Operator for 3D Point Cloud Semantic Segmentation. Highlight: To this end, we propose Point Transformer with Wavelet Neural Operator (PT-WNO), which integrates a shared Wavelet Neural Operator (WNO) branch alongside the skip connections of a point cloud transformer backbone. | Nhut Le; Maryam Rahnemoonfar | arxiv-cs.CV | 2026-06-09 |
| 9 | SAM-SP: SAM-guided Semantic Alignment and Pseudo-label Fine-tuning for Scribble-supervised Medical Image Segmentation | Yang Li et al. | Biomedical Signal Processing and Control | 2026-06-09 |
| 10 | SBEM-UNet: A Semantic Boundary and Contour-Enhanced Framework for Semisupervised Medical Image Segmentation | Hongwei Zhang; Kaijun Yang; Meifeng Shi; Keke Xu; Guolong Wang | Journal of Imaging Informatics in Medicine | 2026-06-08 |
| 11 | Sub-Semantic Image Segmentation. Highlight: To do that, we couple a general-purpose vision-language model to SAM 3, a promptable segmentation backbone whose native text pathway can ground rich descriptions into masks. Simple coupling fails for a number of reasons that we identify in the paper, and we overcome them by introducing DETECTURE, which resolves three concrete failure modes: language leakage between texture regions, prompt competition inside the segmentation backbone, and semantic distortion at the language-to-mask interface. | Aviad Cohen Zada; Nadav Orenstein; Shai Avidan; Gal Oren | arxiv-cs.CV | 2026-06-07 |
| 12 | Spectral–spatial Collaborative Perception and Dynamic Semantic Reasoning for Remote Sensing Image Segmentation | Chenshuai Bai; Kaijun Wu; Xiaofeng Bai; Xiaoqiang Wu | Engineering Applications of Artificial Intelligence | 2026-06-06 |
| 13 | An Ensemble of Deep Learning Models Using A Genetic Algorithm for Improving Bird Image Segmentation. Highlight: The present study proposes a novel method of ensembling deep learning models using Genetic Algorithm (GA) for improving bird image segmentation in a complex background. | B. S. Chandrashekar; H. S. Nagendraswamy; M. P. Pavan Kumar | Engineering, Technology & Applied Science Research | 2026-06-06 |
| 14 | Learning A Semantic Calibration Network for Open-Vocabulary Semantic Segmentation. Highlight: In this paper, we propose a novel Semantic Calibration Network (SCN) for open-vocabulary semantic segmentation. | Yang Sun; Tao Wang; Anastasia Ioannou; Ge Xu | arxiv-cs.CV | 2026-06-06 |
| 15 | Scene-Centric Unsupervised Video Panoptic Segmentation. Highlight: We propose VideoCUPS, the first unsupervised VPS approach. | Christoph Reich et al. | arxiv-cs.CV | 2026-06-03 |
| 16 | Human‐in‐the‐Loop Object Segmentation for 3D Gaussian Splatting Via Finger‐based VR Interface. Highlight: Moreover, these methods significantly struggle with objects with multiple components and occluded scenes. To address these limitations, we propose an interactive human‐in‐the‐loop segmentation framework that combines a fast optimization‐based 3D segmentation algorithm with intuitive finger‐based user interactions within a virtual reality environment. | Yongseok Lee et al. | Advanced Intelligent Systems | 2026-06-01 |
| 17 | A Survey on Active Learning in Visual Semantic Segmentation: Significance, Challenges, and Prospects. Highlight: This paper provides a systematic categorization and comprehensive review of active learning strategies in visual semantic segmentation. | Zongyi Xu et al. | ACM Computing Surveys | 2026-06-01 |
| 18 | SWARD: Stochastic Window-Attention-Based Relational Distillation for Cross-Architectural Semantic Segmentation. Highlight: Existing distillation methods largely assume architectural homogeneity and rely on direct feature mimicry, which fails to bridge this representational gap and neglects the structured spatial dependencies and discriminative organization required for accurate semantic segmentation. In this paper, we propose SWARD, a knowledge distillation framework that addresses this gap through two complementary mechanisms. | Aditya Makineni; Qing Tian | arxiv-cs.CV | 2026-05-31 |
| 19 | Flexible Control of 3D CT Generation Via Text and Semantically-Defined Segmentation Prompts. Highlight: In this work, we propose a flexible multimodal framework for controllable volumetric image generation that supports input from radiology reports and segmentation prompts (both optional). | Weicheng Dai; Chenyu Wang; Andy Li; Shantanu Ghosh; Kayhan Batmanghelich | arxiv-cs.CV | 2026-05-30 |
| 20 | GeoSAM-3D: Geodesic Prompt Propagation for Open-Vocabulary 3D Scene Segmentation from Monocular Video. Highlight: This paper describes the repository state, the mathematical kernel implemented in geosam3d.propagate, the feature head trained from Segment Anything masks, and the validation already present in the codebase. | Arun Sharma | arxiv-cs.CV | 2026-05-29 |
| 21 | Vanilla ViT for Automotive Point Cloud Semantic Segmentation. Highlight: In this work, we show how to effectively leverage vanilla, non-hierarchical ViTs for segmentation of large-scale automotive lidar scenes. | Gilles Puy et al. | arxiv-cs.CV | 2026-05-29 |
| 22 | MSP-UNet: A Measurement-oriented Real-time Framework for Robust Underwater Image Segmentation. Highlight: However, severe optical degradations, including turbidity, light attenuation, and low contrast, significantly compromise boundary localization accuracy and region measurement consistency under constrained computational resources. To address these challenges, we propose MSP-UNet, a measurement-oriented and real-time segmentation framework that integrates multi-scale convolutional spatial priors with a frozen DINOv3 backbone. | Yiming Xing; Bai Liu; Zhenzhou Liu; Na Guo | Measurement Science and Technology | 2026-05-29 |
| 23 | CMFA-Net: A CNN–Mamba Collaborative Feature Alignment Network for Robust Medical Image Segmentation. Highlight: Medical image segmentation still faces three critical challenges: insufficient joint modeling of local details and long-range dependencies, the high computational burden of transformer-based architectures for high-resolution inputs, and performance degradation caused by domain shift across imaging centers and acquisition devices. To address these issues, this paper proposes CMFA-Net, a CNN–Mamba collaborative feature alignment network for robust medical image segmentation. | Liu Yang; Hui Wang; Xiaolin Fu; Yang Wang; Duohai Wu | Electronics | 2026-05-28 |
| 24 | AI-Driven Window Detection and Semantic Segmentation from Street View Imagery Using Grounding DINO and DeepLabV3 for Digital Twin Modeling. Highlight: The proposed pipeline demonstrates the potential of combining zero-shot detection and semantic segmentation for automated façade analysis from street-view imagery. | Sumeer Koirala; Xiaoxiang Zhu; Yao Sun; Alejandro Rueda Segura | Journal on Geoinformatics, Nepal | 2026-05-28 |
| 25 | DGSG-Mind: Dynamic 3D Gaussian Scene Graphs for Long-Term Scene Understanding and Grounding. Highlight: Moreover, current 3D scene understanding methods either rely on simple feature matching without explicit spatial reasoning or assume offline ground-truth 3D geometry. To address these challenges, we present DGSG-Mind, a hybrid instance-aware 3D Gaussian dynamic scene graph system with an embodied reasoning agent. | Luzhou Ge; Xiangyu Zhu; Jinyan Liu; Xuesong Li | arxiv-cs.CV | 2026-05-28 |
| 26 | FoundObj: Self-supervised Foundation Models As Rewards for Label-free 3D Object Segmentation. Highlight: In this paper, we present FoundObj, a novel framework featuring a superpoint-based object discovery agent that incrementally merges suitable neighboring superpoints, guided by our innovative semantic and geometric reward modules. | Zihui Zhang et al. | arxiv-cs.CV | 2026-05-26 |
| 27 | TrackRef3D: Multi-View Consistent Track-then-Label for Open-World Referring Segmentation in 3D Gaussian Splatting. Highlight: However, existing methods typically rely on expensive per-scene manual annotation and per-view pseudo mask generation, which suffer from multi-view inconsistency and poor generalization to varying query specificities. To address this, we present TrackRef3D, a fully automatic pipeline that achieves open-world referring segmentation in 3D Gaussian Splatting (3DGS) without manual annotation by introducing a multi-view consistent track-then-label paradigm that fundamentally decouples object discovery from semantic grounding. | Yuyang Tan; Renhe Zhang; Hang Zhang; Ao Li; Xin Tan | arxiv-cs.CV | 2026-05-26 |
| 28 | Deep Learning-Based Object Detection and Segmentation Methods: A Narrative Review. Highlight: This article presents a comprehensive narrative review of deep learning-based methods for object detection and segmentation, tracing the evolution from seminal convolutional architectures to contemporary transformer-based frameworks and foundation models. | Yipin Wang | Journal of Engineering Research and Reports | 2026-05-26 |
| 29 | ATV-Net: Adaptive Triple-View Network with Dynamic Feature Fusion. Highlight: We propose ATV-Net, an Adaptive Triple-View Network that strengthens a ResNet-101 backbone using three simple but complementary receptive-field views. | Hsin-Jui Pan; Sheng-Wei Chan; Meng-Qian Li; Chun-Po Shen | arxiv-cs.CV | 2026-05-25 |
| 30 | SFR-Net: Learning Scale-Frustum Representations for Ultra-Wide Area Remote Sensing Image Segmentation. Highlight: In this paper, we introduce a novel segmentation task targeting ultra-wide area (UWA) remote sensing images, characterized by both a large pixel count and extremely wide geographical coverage. | Chuyu Zhong et al. | arxiv-cs.CV | 2026-05-25 |
| 31 | D3S2: Diffusion-Guided Dataset Distillation for Semantic Segmentation. Highlight: In this work, we identify three key challenges for segmentation DD: (i) long-tailed class imbalance, (ii) the need for strict pixel-wise alignment between images and dense labels, and (iii) the high computational cost of optimizing high-resolution data with complex models. | Wenjie Zheng; Haoji Hu; Jiali Lu; Xingze Zou; Jing Wang | arxiv-cs.CV | 2026-05-24 |
| 32 | GIBLy: Improving 3D Semantic Segmentation Through An Architecture-Agnostic Lightweight Geometric Inductive Bias Layer. Highlight: We introduce GIBLy, a lightweight geometric inductive bias layer that integrates learnable geometric priors into 3D segmentation pipelines. | Diogo Lavado; Alessandra Micheletti; Clàudia Soares | arxiv-cs.CV | 2026-05-22 |
| 33 | Training-Free Fine-Grained Semantic Segmentations in Low Data Regimes: A FungiTastic Baseline. Highlight: We propose a training-free two-stage framework that decouples segmentation from classification. | Sebastian Cavada; Francesco Pelosin; Lapo Faggi | arxiv-cs.CV | 2026-05-21 |
| 34 | ConvNeXt-FD: A Fractal-Based Deep Model for Robust Biomedical Image Segmentation. Highlight: This paper introduces ConvNeXt-FD, a novel deep learning architecture for robust biomedical image segmentation, built upon a U-Net-like encoder-decoder framework leveraging the powerful ConvNeXt backbone. | Joao Batista Florindo; Amanda Pontes de Oliveira Ornelas | arxiv-cs.CV | 2026-05-21 |
| 35 | UTB-Net: A UNet-like Transformer with Hierarchical Feature Fusion for High-Resolution Remote Sensing Urban Image Segmentation. Highlight: However, accurate segmentation remains challenging due to the inherent difficulty in simultaneously modeling long-range global semantics and fine-grained local details, a dilemma that existing CNN-based or Transformer-based architectures fail to resolve effectively. To overcome this, we propose UTB-Net, a UNet-like Transformer that introduces a hierarchical complementary feature fusion paradigm. | Xuan Li; Yijie Chen; Kunhong Li | Engineering Research Express | 2026-05-20 |
| 36 | SkySeg: Collaborative Onboard Semantic Segmentation with Heterogeneous UAVs in The Wild. Highlight: However, deploying semantic segmentation on resource-constrained UAV platforms presents two significant challenges: 1) hardware constraints limit the ability of UAVs to perform real-time semantic segmentation, and 2) environmental variations during flight cause data distribution shifts, deviating from the original training data. To address these issues, this paper introduces SkySeg, a heterogeneous multi-UAV air-air cooperation framework that integrates computer vision and flight pattern to enable onboard semantic segmentation using low-cost sensors. | Anqi Lu et al. | arxiv-cs.CV | 2026-05-19 |
| 37 | What Makes Synthetic Data Effective in Image Segmentation. Highlight: In particular, synthetic images characterized by dense scene composition and fine instance fidelity demonstrate distinctive benefits, yielding significantly more discriminative spatial representations. Building on these insights, we propose SENSE, a unified framework that leverages flexible and scalable synthetic data to substantially enhance segmentation performance. | Jinjin Zhang; Xiefan Guo; Yizhou Jin; Nan Zhou; Di Huang | arxiv-cs.CV | 2026-05-18 |
| 38 | Best Segmentation Buddies for Image-Shape Correspondence. Highlight: In this work, we examine the underexplored task of estimating segmentation-to-segmentation correspondence between images in the wild and untextured 3D shapes. | Itai Lang; Dongwei Lyu; Dale Decatur; Rana Hanocka | arxiv-cs.CV | 2026-05-18 |
| 39 | Unsupervised Optimization of Boundary Information Based on The Coefficient of Variation to Improve Image Segmentation. Highlight: A method to optimize the automatic retrieval of complete and proper boundary information is proposed in this research based on an unsupervised approach. | Cahyo Crysdian | Register: Jurnal Ilmiah Teknologi Sistem Informasi | 2026-05-17 |
| 40 | A Survey on Assistive Vision Technologies for Visually Impaired Individuals: Approaches, Techniques, and Future Directions. Highlight: This survey reviews advances and deployment strategies for assistive vision systems that combine continuous on-device object detection with selective, high-fidelity scene segmentation and succinct audio narration. | Sahil Pramod Warade | International Journal for Research in Applied Science and … | 2026-05-16 |
| 41 | WOW-Seg: A Word-free Open World Segmentation Model. Highlight: However, traditional closed-set segmentation approaches struggle to adapt to complex open world scenarios, while foundation segmentation models such as SAM exhibit notable discrepancies between their strong segmentation capabilities and relatively weaker semantic understanding. To bridge these discrepancies, we propose WOW-Seg, a Word-free Open World Segmentation model for segmenting and recognizing objects from open-set categories. | Danyang Li et al. | arxiv-cs.CV | 2026-05-16 |
| 42 | UniTriGen: Unified Triplet Generation of Aligned Visible-Infrared-Label for Few-Shot RGB-T Semantic Segmentation. Highlight: As a result, consistency among VIS, IR, and Label in spatial structure, semantic content, and cross-modal details cannot be reliably maintained. To address this issue, we propose UniTriGen, a unified triplet generation framework that directly generates spatially aligned, semantically consistent, and modality complementary VIS-IR-Label triplets under the guidance of text prompts. | Ping Zhou et al. | arxiv-cs.CV | 2026-05-14 |
| 43 | Binary Vegetation Mapping from High-Resolution Satellite Imagery Using Deep Learning: A Comparative Study. Highlight: This paper discusses a comparison between two popular deep learning models: U-Net and DeepLabV3+ when working with high-resolution images and binary semantic segmentation tasks based on high-resolution satellite images. | Harish Kundar | International Journal for Research in Applied Science and … | 2026-05-14 |
| 44 | MEP-Net: A PIDNet-based Model with Median-enhanced Spatial-channel Attention for Segmentation of Hepatocellular Carcinoma in CEUS Images. Highlight: To improve segmentation performance, this study proposes an improved PIDNet-based model termed MEP-Net. | Si-Hua Yang et al. | BMC Medical Imaging | 2026-05-13 |
| 45 | Weakly Supervised Segmentation As Semantic-Based Regularization. Highlight: The refined foundation model then produces improved pseudo-labels, from which we train a second-stage prompt-free segmentation model. | Stefano Colamonaco; Andrei-Bogdan Florea; Jaron Maene | arxiv-cs.CV | 2026-05-13 |
| 46 | Towards Unified Surgical Scene Understanding: Bridging Reasoning and Grounding Via MLLMs. Highlight: However, existing approaches typically address these components in isolation, leading to fragmented representations and limited semantic consistency. To address this limitation, we propose SurgMLLM, a unified surgical scene understanding framework that bridges high-level reasoning and low-level visual grounding within a single model. | Jincai Huang et al. | arxiv-cs.CV | 2026-05-13 |
| 47 | OCH3R: Object-Centric Holistic 3D Reconstruction. Highlight: We introduce OCH3R, a unified framework for Object-Centric Holistic 3D Reconstruction from a single RGB image. | Yi Du; Yang You; Xiang Wan; Leonidas Guibas | arxiv-cs.CV | 2026-05-13 |
| 48 | LEXI-SG: Monocular 3D Scene Graph Mapping with Room-Guided Feed-Forward Reconstruction. Highlight: In this work, we present LEXI-SG, the first dense monocular visual mapping system for open-vocabulary 3D scene graphs using only RGB camera input. | Christina Kassab; Hyeonjae Gil; Matías Mattamala; Ayoung Kim; Maurice Fallon | arxiv-cs.RO | 2026-05-13 |
| 49 | EvObj: Learning Evolving Object-centric Representations for 3D Instance Segmentation Without Scene Supervision. Highlight: We introduce EvObj for unsupervised 3D instance segmentation that bridges the geometric domain gap between synthetic pretraining data and real-world point clouds. | Jiahao Chen et al. | arxiv-cs.CV | 2026-05-13 |
| 50 | Semantic Segmentation Method for Sparse Point Clouds Based on Straight Flow Completion and Multi-Feature Fusion. Highlight: These challenges hinder the practical application of point cloud semantic segmentation. To address these issues, this paper presents a novel semantic segmentation method that integrates sparse point cloud completion with multi-feature fusion. | Tong Zheng; Zhiyuan Meng; Chongchong Yu; Tao Xie; Yewang Xu | Sensors | 2026-05-12 |
| 51 | MPerS: Dynamic MLLM MixExperts Perception-Guided Remote Sensing Scene Segmentation. Highlight: We design a Dynamic MixExperts module that adaptively integrates the most effective textual semantics. | Ziyi Wang; Xianping Ma; Ziyao Wang; Hongyang Zhang; Man On Pun | arxiv-cs.CV | 2026-05-11 |
| 52 | Urban-ImageNet: A Large-Scale Multi-Modal Dataset and Evaluation Framework for Urban Space Perception. Highlight: We present Urban-ImageNet, a large-scale multi-modal dataset and evaluation benchmark for urban space perception from user-generated social media imagery. | Yiwei Ou et al. | arxiv-cs.CV | 2026-05-10 |
| 53 | Semantic Alignment in Hyperbolic Space for Open-Vocabulary Semantic Segmentation. Highlight: In this work, we propose HyRo, a hyperbolic fine-tuning framework that decouples hierarchical and semantic alignment in the Poincaré ball model. | Hoang M. Truong; Hai Nguyen-Truong; Dang Huynh | arxiv-cs.CV | 2026-05-09 |
| 54 | Distill, Diffuse, and Semanticize (DDS): Annotation-Free 3D Scene Understanding Based on Multi-Granularity Distillation and Graph-Diffusion-Based Segmentation. Highlight: We propose an annotation-free 3D scene semantic understanding method based on multi-granularity distillation and graph-diffusion-based segmentation. | Yijing Wang; Ruonan Li; Qilin Wang; Rongqiang Zhao; Jie Liu | arxiv-cs.CV | 2026-05-08 |
| 55 | UniD-Shift: Towards Unified Semantic Segmentation Via Interpretable Share-Private Multimodal Decomposition. Highlight: Inspired by the fact that 2D images captured by cameras are representations of the 3D world, we recognize that the features learned from 2D and 3D segmentation share some common semantics, while other aspects remain modality-specific. This insight motivates a unified multimodal framework for joint 2D-3D semantic segmentation. | Shuai Zhang et al. | arxiv-cs.CV | 2026-05-08 |
| 56 | Linear Semantic Segmentation for Low-Resource Spoken Dialects. Highlight: In this paper, we introduce a new multi-genre benchmark (more than 1000 samples) for semantic segmentation in conversational Arabic, focusing on dialectal discourse. | Kirill Chirkunov; Younes Samih; Abed Alhakim Freihat; Hanan Aldarmaki | arxiv-cs.CL | 2026-05-07 |
| 57 | From Diffusion to Rectified Flow: Rethinking Text-Based Segmentation. Highlight: In response, we propose RLFSeg, a novel framework that leverages Rectified Flow to learn direct mapping from the image to the segmentation mask within the latent space. | Zishen Qu et al. | arxiv-cs.CV | 2026-05-06 |
| 58 | FlowDIS: Language-Guided Dichotomous Image Segmentation with Flow Matching. Highlight: Existing DIS approaches often fail to preserve fine-grained details or fully capture the semantic structure of the foreground. To address these challenges, we present FlowDIS, a novel dichotomous image segmentation method built on the flow matching framework, which learns a time-dependent vector field to transport the image distribution to the corresponding mask distribution, optionally conditioned on a text prompt. | Andranik Sargsyan; Shant Navasardyan | arxiv-cs.CV | 2026-05-06 |
| 59 | Looking Locally: Object-Centric Vision Transformers As Foundation Models for Efficient Segmentation. Highlight: We introduce FLIP (Fovea-Like Input Patching), a parameter-efficient vision model that realizes object segmentation through biologically-inspired top-down attention. | Manuel Traub; Martin V Butz | icml | 2026-05-05 |
| 60 | Look, Listen and Segment: Towards Weakly Supervised Audio-Visual Semantic Segmentation. Highlight: We introduce Weakly Supervised Audio-Visual Semantic Segmentation (WSAVSS), which uses only video-level labels to generate per-frame semantic masks of sounding objects. | C. Li; H. Huang; P. Jian; Y. Zhou | icassp | 2026-05-04 |
| 61 | SAM-ALE: Enhancing SAM for Low-Shot Digital Pathology Semantic Segmentation Via An Auxiliary Lightweight Encoder. Highlight: While various medical SAM variants have been developed to adapt SAM to medical datasets, we observe that they still require a large amount of labeled data to perform effectively in digital pathology semantic segmentation, where annotations are particularly costly and time-consuming. To address this challenge, we propose SAM-ALE, a novel approach that enhances SAM for Low-shot digital pathology semantic segmentation via an Auxiliary Lightweight Encoder. | Y. Qian; M. Li; M. Wu; K. Yan; P. Wang | icassp | 2026-05-04 |
| 62 | SAM Meets Mask2Former: A Segmoe-Hybrid Model for Semantic Segmentation. Highlight: We propose SaM3oe, a novel framework integrating SAM’s encoder with Mask2Former, augmented by a SAM adapter module and a plug-and-play Mixture of Experts (MoE) architecture tailored for segmentation, termed SegMoE. | Y. Wang; Z. Zhao; X. Wang; S. Zhang | icassp | 2026-05-04 |
| 63 | E2-CAM: Edge-Enhanced Prototype-Based Class Activation Map for Weakly Supervised Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While prototype-based methods can partially expand the response region, the limited image-level supervision reduces the intra-class consistency of the generated prototypes, making them insufficient for accurate object segmentation. To address these limitations, we propose a novel weakly supervised segmentation method: edge-enhanced class activation map (E2-CAM). |
Z. Wang; P. Wang; J. Yu; P. Chen; W. Li; | icassp | 2026-05-04 |
| 64 | Distribution-Aware Data Curation for Semantic Segmentation Via Mixture of VMFS Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: They either perform unsupervised core-set selection or rely on active learning with iterative annotation to construct the subset de novo, rather than leveraging distribution-verified segmentation datasets as reference targets to initialize a curated pool. To address this, we propose a distribution-guided curation framework that explicitly selects subsets from a raw pool to mimic pixel-level feature distributions of a validated reference dataset. |
Z. Hu; K. Yang; A. Vijayalingam; Z. Dai; | icassp | 2026-05-04 |
| 65 | AlignCLIP: Mining and Aligning Multi-Scale Vision-Language Features For Zero-Shot Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To better exploit CLIP’s potential in zero-shot segmentation, we propose AlignCLIP, a one-stage framework that fuses multi-scale image context with textual features to improve segmentation. |
L. MEI et. al. | icassp | 2026-05-04 |
| 66 | Prototype Learning-Based Few-Shot Medical Image Segmentation Framework Using Clip-Aided U-Net Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Medical image segmentation is crucial for automated clinical diagnostics, but is often hindered by the scarcity of annotated data. To address this, we propose a novel few-shot learning-based segmentation framework that integrates the complementary strengths of U-Net, for precise boundary localization, and CLIP, for rich semantic generalization. |
S. Pani; B. Adhya; V. Gulvanskii; D. Kaplun; R. Sarkar; | icassp | 2026-05-04 |
| 67 | MAC-SAM: Mask-Aware Category-Guided Segment Anything Model for Interactive Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Third, text lacks the ability to provide precise object boundaries, limiting segmentation accuracy. To address these challenges, we propose MAC-SAM, which simplifies text input into a single-click selection through an MLLM-Assisted Category Label Pre-processing Strategy and introduces the Mask-Category Alignment Module (MCAM) and Object-Category Guider (OCG) to align visual and text modalities during training, enabling SAM to better perceive object boundaries. |
W. LAI et. al. | icassp | 2026-05-04 |
| 68 | A Framework For Text-To-Semantic Segmentation Map Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we propose a semantic segmentation map generation framework named TSeg, in which a low-to-high resolution strategy is designed for higher input consistency. |
X. Zheng; G. Jiang; S. Hou; W. Wang; | icassp | 2026-05-04 |
| 69 | DISCERN: Discrepancy Learning for Weakly Supervised Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, their performance often degrades in medical imaging due to insufficient consideration of medical characteristics, such as distributional discrepancies, ambiguous boundaries, and structural interference. To address these issues, we propose an innovative discrepancy learning model, DISCERN, which harnesses distribution discrepancies to enhance the localization of medical regions of interest. |
G. Su; | icassp | 2026-05-04 |
| 70 | One-Stage Semi-Supervised Semantic Segmentation for Anomaly Detection Via Consistency Regularization and Student-Teacher Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose SSAD (Semi-Supervised Semantic Segmentation-based Anomaly Detection), which incorporates segmentation techniques more suitable for the SSS scenario. |
Y. Nakanishi; | icassp | 2026-05-04 |
| 71 | B2Camo: Leveraging Background Cues For Parameter-Efficient Fine-Tuning In Open-Vocabulary Camouflaged Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these, we introduce B2Camo, a novel framework that repurposes the background from an interference source into a critical cue for segmentation. |
D. GUI et. al. | icassp | 2026-05-04 |
| 72 | RIS-FUSION: Rethinking Text-Driven Infrared and Visible Image Fusion from The Perspective of Referring Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We observe that referring image segmentation (RIS) and text-driven fusion share a common objective: highlighting the object referred to by the text. Motivated by this, we propose RIS-FUSION, a cascaded framework that unifies fusion and RIS through joint optimization. |
S. Ma; C. Gong; X. Fan; Y. Ma; C. Jiang; | icassp | 2026-05-04 |
| 73 | A Text-Image Fusion Method with Data Augmentation Capabilities for Referring Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, common augmentations like rotation and flipping disrupt spatial alignment between image and text, weakening performance. To address this, we propose an early fusion framework that combines text and visual features before augmentation, preserving spatial consistency. |
S. Chai; | icassp | 2026-05-04 |
| 74 | Sam-Guided Multi-View Fusion for Weakly Supervised 3d Point Cloud Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel framework that augments weak 3D supervision with external cues from the 2D vision domain. |
Y. Qiao; | icassp | 2026-05-04 |
| 75 | Privacy-Concealing Cooperative Perception for BEV Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel Privacy-Concealing Cooperation (PCC) framework for Bird’s Eye View (BEV) semantic segmentation. |
S. Wang; L. Li; M. Santos; G. Wang; | icassp | 2026-05-04 |
| 76 | LMS-Net: Light-weight Multi-object Segmentation Network for Laparoscopic Surgical Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
JIAYI ZHANG et. al. | Biomed. Signal Process. Control. | |
| 77 | Multi-Dataset Cross-Domain Knowledge Distillation for Unified Medical Image Segmentation, Classification, and Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a unified cross-domain transfer learning framework that leverages knowledge from multiple heterogeneous medical imaging datasets to improve performance across segmentation, classification, and object detection tasks. |
Ceausescu Ciprian-Mihai; Anghelina Ion-Marian; Alexe Dumitru-Bogdan; | arxiv-cs.CV | 2026-05-02 |
| 78 | Graph-based Semantic Calibration Network for Unaligned UAV RGBT Image Semantic Segmentation and A Large-scale Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, UAV RGBT semantic segmentation faces two coupled challenges: cross-modal spatial misalignment caused by sensor parallax and platform vibration, and severe semantic confusion among fine-grained ground objects under top-down aerial views. To address these issues, we propose a Graph-based Semantic Calibration Network (GSCNet) for unaligned UAV RGBT image semantic segmentation. |
Fangqiang Fan; Zhicheng Zhao; Xiaoliang Ma; Chenglong Li; Jin Tang; | arxiv-cs.CV | 2026-04-29 |
| 79 | Seeking Consensus: Geometric-Semantic On-the-Fly Recalibration for Open-Vocabulary Remote Sensing Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite notable advances, existing methods typically employ a static inference paradigm, overlooking the distinct distribution of each scene, resulting in semantic ambiguity in diverse land covers and incomplete foreground activation. Motivated by this, we propose Seeking Consensus, termed SeeCo, a plug-and-play framework to boost the performance of training-free OVSS models in remote sensing images, which recalibrates arbitrary OVSS models on-the-fly by seeking dual consensus: geometric consensus learning (GCL) through multi-view consistent observations and semantic consensus learning (SCL) via textual description adaptive calibration, which assists collaborative recalibration of visual and textual semantics. |
GUANCHUN WANG et. al. | arxiv-cs.CV | 2026-04-28 |
| 80 | A Novel Unsupervised Method for Tomato Image Segmentation Using Hue Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The primary objective of the present study is to propose a suitable segmentation approach. |
V.V. Waykule; D.S. Bormane; | International Journal of Drug Delivery Technology | 2026-04-27 |
| 81 | Diffusion Model As A Generalist Segmentation Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Concretely, we introduce DiGSeg (Diffusion Models as a Generalist Segmentation Learner), which repurposes a pretrained diffusion model into a unified segmentation framework. |
HAOXIAO WANG et. al. | arxiv-cs.CV | 2026-04-27 |
| 82 | Open-Vocabulary Semantic Segmentation Network Integrating Object-Level Label and Scene-Level Semantic Features for Multimodal Remote Sensing Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current multi-modal approaches mainly focus on integrating complementary visual modalities, yet neglect the incorporation of non-visual textual data – a rich source of knowledge that can bridge semantic gaps between visual patterns and real-world concepts. To address this limitation, we propose TSMNet, a text-supervised multi-modal open-vocabulary semantic segmentation network that synergistically integrates textual supervision with visual representation for open-vocabulary semantic segmentation. |
JINKUN DAI et. al. | arxiv-cs.CV | 2026-04-27 |
| 83 | X2SAM: Any Segmentation in Images and Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce X2SAM, a unified segmentation MLLM that extends any-segmentation capabilities from images to videos. |
HAO WANG et. al. | arxiv-cs.CV | 2026-04-27 |
| 84 | Image Splicing and Semantic Segmentation–based Omni-directional Monitoring for Substation Equipment Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Ning Chen; Xin Yang; Hongxin Chang; | Discover Artificial Intelligence | 2026-04-26 |
| 85 | SemiGDA: Generative Dual-distribution Alignment for Semi-Supervised Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This limits robust semantic representation learning and adaptive modeling of unlabeled data in scenarios with few labels. To address these limitations, we propose SemiGDA, a novel Generative Dual-distribution Alignment framework for semi-supervised medical image segmentation. |
Kaiwen Huang; Yi Zhou; Yizhe Zhang; Jingxiong Li; Tao Zhou; | arxiv-cs.CV | 2026-04-25 |
| 86 | A Review of The Evolution and Challenges of Few-shot Medical Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This article systematically explores foundational concepts such as medical image segmentation methods, benchmark datasets, performance evaluation metrics, and main approaches for few-shot medical image segmentation. |
HONG ZHAO et. al. | PeerJ Computer Science | 2026-04-24 |
| 87 | Efficient Adaptation of Vision Foundation Model for High-Resolution Remote Sensing Image Segmentation Via Spatial-Frequency Modeling and Sparse Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we introduce a Spatial-Frequency Adapter (SF-Adapter) to improve backbone-level dense feature adaptation by jointly modeling global frequency responses and multiscale local spatial details in a lightweight bottleneck space. |
CHENLONG DING et. al. | Remote Sensing | 2026-04-24 |
| 88 | INSIGHT: Indoor Scene Intelligence from Geometric-Semantic Hierarchy Transfer for Public Safety Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents INSIGHT, a zero-target-domain-annotation pipeline that projects 2D image understanding into 3D metric space via registered RGB-D data. |
Alexander Nikitas Dimopoulos; Joseph Grasso; John Beltz; | arxiv-cs.CV | 2026-04-24 |
| 89 | Joint Spectral Image Reconstruction and Semantic Segmentation with Cooperative Unfolding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To make the two mutually reinforcing, we introduce the Cross-Aggregated Super-Token Attention (CASTA) mechanism to enhance the representation interactions between HSI reconstruction and semantic segmentation. |
Zijun He; Ping Wang; Xiaodong Wang; ChangChen ChangChen; Xin Yuan; | cvpr | 2026-04-21 |
| 90 | Discover, Segment, and Select: A Progressive Mechanism for Zero-shot Camouflaged Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, relying solely on MLLMs for camouflaged object discovery often leads to inaccurate localization, false positives, and missed detections. To address these issues, we propose the Discover-Segment-Select (DSS) mechanism, a three-stage framework that progressively refines the segmentation process. |
Yilong Yang; Jianxin Tian; Shengchuan Zhang; Liujuan Cao; | cvpr | 2026-04-21 |
| 91 | MatchMask: Mask-Centric Generative Data Augmentation for Label-Scarce Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose MatchMask, a novel mask-centric generative data augmentation approach tailored for label-scarce semantic segmentation. |
YUQI LIN et. al. | cvpr | 2026-04-21 |
| 92 | HOPS: Hierarchical Open-vocabulary Part Segmentation with Attention-Aware Filtering and Affinity-Guided Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing VLM-based methods face two challenges: (1) object over-segmentation, caused by overly broad semantic activations, and (2) part under-segmentation, resulting from weak fine-grained perception. To address these issues, we propose HOPS, a two-stage framework for hierarchical open-vocabulary part segmentation. |
XINLONG LI et. al. | cvpr | 2026-04-21 |
| 93 | CA-LoRA: Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, it often overfits and memorizes training data, limiting their ability to generate diverse and well-aligned samples. To overcome these issues, we propose Concept-Aware LoRA (CA-LoRA), a novel fine-tuning approach that selectively identifies and updates only the weights associated with necessary concepts (e.g., style or viewpoint) for domain alignment while preserving the pretrained knowledge of the T2I model to produce informative samples. |
MINHO PARK et. al. | cvpr | 2026-04-21 |
| 94 | VGGT-Segmentor: Geometry-Enhanced Cross-View Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While recent geometry-aware models like VGGT provide a strong foundation for feature alignment, we find they often fail at dense prediction tasks due to significant pixel-level projection drift, even when their internal object-level attention remains consistent. To bridge this gap, we introduce VGGT-Segmentor (VGGT-S), a framework that unifies robust geometric modeling with pixel-accurate semantic segmentation. |
YULU GAO et. al. | cvpr | 2026-04-21 |
| 95 | MedCLIPSeg: Probabilistic Vision-Language Adaptation for Data-Efficient and Generalizable Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present MedCLIPSeg, a novel framework that adapts CLIP for robust, data-efficient, and uncertainty-aware medical image segmentation. |
TAHA KOLEILAT et. al. | cvpr | 2026-04-21 |
| 96 | ReAttnCLIP: Training-Free Open-Vocabulary Remote Sensing Image Segmentation Via Re-defined Attention in CLIP Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, current methods generate suboptimal representations when capturing the complex spatial hierarchies in remote sensing. We address this gap by optimizing CLIP’s 197×197 attention matrix through three key modifications: (1) substituting the 196×196 patch-to-patch submatrix with intermediate-layer feature similarities to preserve spatial structures; (2) prioritizing intermediate-layer attention for global-to-local (class-to-patch) token alignment to reduce classification interference; (3) disabling the [CLS] token’s self-attention to mitigate bias. |
Xin Niu; Manqi Zhao; Dongsheng Jiang; Yingying Wu; Bing Su; | cvpr | 2026-04-21 |
| 97 | Scene-VLM: Multimodal Video Scene Segmentation Via Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present Scene-VLM, the first fine-tuned vision-language model (VLM) framework for video scene segmentation. |
NIMROD BERMAN et. al. | cvpr | 2026-04-21 |
| 98 | Hilbert Curve-Based Attention Enabling Topology-Preserving Image Tensor Representation for Semantic Segmentation Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Drone-based building defect segmentation remains challenging due to complex surface textures and illumination variations. We propose TPSegformer, a topology-preserving segmentation framework that mitigates mis-segmentation in such scenarios. |
Linkang Xu; Gang Li; Yue Song; Xiangxin Ji; | cvpr | 2026-04-21 |
| 99 | Making Training-Free Diffusion Segmentors Scale with The Generative Power Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: (ii) Even when a global map is available, it does not directly translate to accurate semantic correlation for segmentation, due to score imbalances among different text tokens. To bridge these gaps, we propose two techniques: auto aggregation and per-pixel rescaling, which together enable training-free segmentation to better leverage model capability. |
BENYUAN MENG et. al. | cvpr | 2026-04-21 |
| 100 | Scene-Centric Unsupervised Video Panoptic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose CUViPS, the first unsupervised VPS approach. |
CHRISTOPH REICH et. al. | cvpr | 2026-04-21 |
| 101 | EVObject: Learning Evolving Object-centric Representations for 3D Instance Segmentation Without Scene Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce EVObject for unsupervised 3D instance segmentation that bridges the geometric domain gap between synthetic pretraining data and real-world point clouds. |
JIAHAO CHEN et. al. | cvpr | 2026-04-21 |
| 102 | The Missing Point in Vision Transformers for Universal Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce ViT-P, a novel two-stage segmentation framework that decouples mask generation from classification. |
SAJJAD SHAHABODINI et. al. | cvpr | 2026-04-21 |
| 103 | SLARM: Streaming and Language-Aligned Reconstruction Model for Dynamic Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose SLARM, a feed-forward model that unifies dynamic scene reconstruction, semantic understanding, and real-time streaming inference. |
ZHICHENG QIU et. al. | cvpr | 2026-04-21 |
| 104 | Task-Oriented Data Synthesis and Control-Rectify Sampling for Remote Sensing Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the complexity of semantic mask control and the uncertainty of sampling quality often limit the utility of synthetic data in downstream semantic segmentation tasks. To address these challenges, we propose a task-oriented data synthesis framework (TODSynth), including a Multimodal Diffusion Transformer (MM-DiT) with unified triple attention and a plug-and-play sampling strategy guided by task feedback. |
YUNKAI YANG et. al. | cvpr | 2026-04-21 |
| 105 | Open-Vocabulary Domain Generalization in Urban-Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Yet, these models remain sensitive to domain shifts and struggle to maintain robustness when deployed in unseen environments, a challenge that is particularly severe in urban-driving scenarios. To bridge this gap, we introduce Open-Vocabulary Domain Generalization in Semantic Segmentation (OVDG-SS), a new setting that jointly addresses unseen domains and unseen categories. |
DONG ZHAO et. al. | cvpr | 2026-04-21 |
| 106 | CDICS: Delving Into Fine-Grained Attribute for In-Context Segmentation Via Compositional Prompts and Phased Decoupling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, finding a perfectly matching single example for real-world rare and complex concepts is difficult; and existing methods are largely confined to semantic or instance-level understanding of the reference image and struggle to express more precise segmentation needs through the input. To address this, we propose CDICS, a novel framework that leverages Compositional prompts and phased task Decoupling to achieve compositional prompt-controlled In-Context Segmentation. |
ZHIYU LI et. al. | cvpr | 2026-04-21 |
| 107 | HySeg: Learning Generative Priors for Structure-Aware Remote Sensing Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This imbalance leads to fragmented boundaries, texture overfitting, and poor cross-domain generalization. We address this challenge by reformulating RSISS as posterior inference grounded in generative structural priors, introducing HySeg, a hybrid generative–discriminative segmentation paradigm that learns structure-consistent priors through generative modeling and guides posterior inference for remote sensing segmentation. |
JIE QIU et. al. | cvpr | 2026-04-21 |
| 108 | Seeing Both Sides: Towards Bidirectional Semantic Alignment for Open-Vocabulary Camouflaged Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Such a design neglects the bidirectional interaction between visual and language modalities, making the model vulnerable to the semantic gap between image-level textual semantics and pixel-level segmentation cues, which in turn leads to severe semantic confusion in complex camouflaged scenarios. To address this challenge, we propose BaCLIP, a novel bidirectional semantic alignment framework for OVCOS. |
GUOHUI ZHANG et. al. | cvpr | 2026-04-21 |
| 109 | Feasibility of Indoor Frame-Wise Lidar Semantic Segmentation Via Distillation from Visual Foundation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The success of such distillation has been shown for autonomous driving scenes, but not yet for indoor scenes. Here, we study the feasibility of repeating this success for indoor scenes, in a frame-wise distillation manner by coupling each lidar scan with a VFM-processed camera image. |
Haiyang Wu; Juan J. Gonzales Torres; George Vosselman; Ville Lehtola; | arxiv-cs.CV | 2026-04-20 |
| 110 | T-REN: Learning Text-Aligned Region Tokens Improves Dense Vision-Language Alignment and Scalability Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose T-REN (Text-aligned Region Encoder Network), an efficient encoder that maps visual data to a compact set of text-aligned region-level representations (or region tokens). |
Savya Khosla; Sethuraman T; Aryan Chadha; Alex Schwing; Derek Hoiem; | arxiv-cs.CV | 2026-04-20 |
| 111 | Instant Colorization of Gaussian Splats Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We demonstrate the effectiveness of our approach on scene relighting, feature enrichment and 3D semantic segmentation tasks, achieving up to an order of magnitude speedup compared to gradient descent-based baselines. |
Daniel Lieber; Alexander Mock; Nils Wandel; | arxiv-cs.CV | 2026-04-18 |
| 112 | SENSE: Stereo OpEN Vocabulary SEmantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By incorporating stereo image pairs, we introduce geometric cues that improve spatial reasoning and segmentation accuracy. |
Thomas Campagnolo; Ezio Malis; Philippe Martinet; Gaétan Bahl; | arxiv-cs.CV | 2026-04-17 |
| 113 | Towards Realistic Open-Vocabulary Remote Sensing Segmentation: Benchmark and Baseline Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our previous OVRSISBenchV1 established an initial cross-dataset evaluation protocol, but its limited scope is insufficient for assessing realistic open-world generalization. To address this issue, we propose OVRSISBenchV2, a large-scale and application-oriented benchmark for OVRSIS. |
BINGYU LI et. al. | arxiv-cs.CV | 2026-04-16 |
| 114 | HyperLiDAR: Adaptive Post-Deployment LiDAR Segmentation Via Hyperdimensional Computing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Furthermore, edge systems operate under strict computational and energy constraints, making it infeasible to adapt conventional segmentation models (based on large neural networks) directly on-device. To address the above challenges, we introduce HyperLiDAR, the first lightweight, post-deployment LiDAR segmentation framework based on Hyperdimensional Computing (HDC). |
IVANNIA GOMEZ MORENO et. al. | arxiv-cs.CV | 2026-04-14 |
| 115 | Right Regions, Wrong Labels: Semantic Label Flips in Segmentation Under Correlation Shift Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We show that a model may achieve reasonable overlap while assigning the wrong semantic label, swapping one plausible foreground class for another, even when object boundaries are largely correct. We focus on this semantic label-flip behaviour and quantify it with a simple diagnostic (Flip) that counts how often ground truth foreground pixels are assigned the wrong foreground identity while remaining predicted as foreground. |
AKSHIT ACHARA et. al. | arxiv-cs.CV | 2026-04-14 |
| 116 | Restoration Adaptation for Semantic Segmentation on Low Quality Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose Restoration Adaptation for Semantic Segmentation (RASS), which effectively integrates semantic image restoration into the segmentation process, enabling high-quality semantic segmentation on the LQ images directly. |
KAI GUAN et. al. | International Journal of Computer Vision | 2026-04-14 |
| 117 | Coronary Angiography Image Segmentation and Atherosclerotic Plaque Auxiliary Detection System Using An Improved Pyramid Scene Parsing Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the performance bottlenecks of existing deep learning methods in handling complex vascular topology, small branches, and low-contrast plaques, this study proposes a deep learning-assisted diagnostic system based on an enhanced Pyramid Scene Parsing Network (PSPNet). |
Chao-min Li; Yang Yang; Yuanbo Li; Jiaqi Zhang; Wei Luo; | Journal of Mechanics in Medicine and Biology | 2026-04-14 |
| 118 | Do Instance Priors Help Weakly Supervised Semantic Segmentation? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Semantic segmentation requires dense pixel-level annotations, which are costly and time-consuming to acquire. To address this, we present SeSAM, a framework that uses a foundational segmentation model, i.e. Segment Anything Model (SAM), with weak labels, including coarse masks, scribbles, and points. |
Anurag Das; Anna Kukleva; Xinting Hu; Yuki M. Asano; Bernt Schiele; | arxiv-cs.CV | 2026-04-13 |
| 119 | GS4City: Hierarchical Semantic Gaussian Splatting Via City-Model Priors Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present GS4City, a hierarchical semantic Gaussian Splatting method that incorporates city-model priors for urban scene understanding. |
Qilin Zhang; Jinyu Zhu; Olaf Wysocki; Benjamin Busam; Boris Jutzi; | arxiv-cs.CV | 2026-04-13 |
| 120 | TAMISeg: Text-Aligned Multi-scale Medical Image Segmentation with Semantic Encoder Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose TAMISeg, a text-guided segmentation framework that incorporates clinical language prompts and semantic distillation as auxiliary semantic cues to enhance visual understanding and reduce reliance on pixel-level fine-grained annotations. |
QIANG GAO et. al. | arxiv-cs.CV | 2026-04-12 |
| 121 | Real-Time Satellite Image Segmentation Using YOLOv8 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The project focuses on real-time satellite image segmentation using YOLOv8 for detecting and classifying land regions into three categories: Agriculture Lands, Water Bodies, and Urban Buildings. |
Mrs. R. Prathyusha; | International Journal for Research in Applied Science and … | 2026-04-11 |
| 122 | Image Segmentation Techniques in Computer Vision Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, this paper aims to analyze the development of image segmentation algorithms, starting from transformer networks and self-supervised learning models to vision-language models and foundation models. |
Gaurav Sahu; Goldi Soni; Rahul Sharma; | INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING … | 2026-04-10 |
| 123 | SwinTextUNet: Integrating CLIP-Based Text Guidance Into Swin Transformer U-Nets for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditional models that rely solely on visual features often struggle when confronted with ambiguous or low-contrast patterns. To overcome these limitations, we introduce SwinTextUNet, a multimodal segmentation framework that incorporates textual embeddings derived from Contrastive Language-Image Pretraining (CLIP) into a Swin Transformer UNet backbone. |
Ashfak Yeafi; Parthaw Goswami; Md Khairul Islam; Ashifa Islam Shamme; | arxiv-cs.CV | 2026-04-10 |
| 124 | PanoSAM2: Lightweight Distortion- and Memory-aware Adaptions of SAM2 for 360 Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose PanoSAM2, a novel 360VOS framework based on our lightweight distortion- and memory-aware adaptation strategies of SAM2 to achieve reliable 360VOS while retaining SAM2’s user-friendly prompting design. |
Dingwen Xiao; Weiming Zhang; Shiqi Wen; Lin Wang; | arxiv-cs.CV | 2026-04-09 |
| 125 | ModuSeg: Decoupling Object Discovery and Semantic Retrieval for Training-Free Weakly Supervised Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although foundation models show immense potential, many approaches still follow the tightly coupled optimization paradigm, struggling to effectively alleviate pseudo-label noise and often relying on time-consuming multi-stage retraining or unstable end-to-end joint optimization. To address the above challenges, we present ModuSeg, a training-free weakly supervised semantic segmentation framework centered on explicitly decoupling object discovery and semantic assignment. |
Qingze He; Fagui Liu; Dengke Zhang; Qingmao Wei; Quan Tang; | arxiv-cs.CV | 2026-04-08 |
| 126 | PASTA: Vision Transformer Patch Aggregation for Weakly Supervised Target and Anomaly Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing perception systems frequently fail to satisfy the strict operational requirements of these domains, specifically real-time processing, pixel-level segmentation precision, and robust accuracy, due to their reliance on exhaustively annotated datasets. To address these limitations, we propose a weakly supervised pipeline for object segmentation and classification using weak image-level supervision called ‘Patch Aggregation for Segmentation of Targets and Anomalies’ (PASTA). |
Melanie Neubauer; Elmar Rueckert; Christian Rauch; | arxiv-cs.CV | 2026-04-07 |
| 127 | Semantic Segmentation of Multispectral Satellite Images Using Residual Convolutional Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to improve the generalization ability of semantic segmentation algorithms, a combined model of UNet_ResNet is used in this paper. |
Abhinav Chandra; Anuradha Chetan Phadke; Vaidehi Deshmukh; | International Journal of Image, Graphics and Signal … | 2026-04-03 |
| 128 | Comparative Performance Analysis of U-Net and DeepLabV3+ for Semantic Segmentation in Traffic Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work proposes a new method for processing poor quality traffic images by sequentially applying super-resolution (SR), semantic segmentation (SS), and object detection with YOLOv8x. |
S. Rai Utsavi; S. Raghavendra; B. N. Anoop; P. S. Venugopala; | Scientific Reports | 2026-04-01 |
| 129 | Advancing Complex Video Object Segmentation Via Tracking-Enhanced Prompt: The 1st Winner for 5th PVUW MOSE Challenge Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The root cause of this limitation lies in SAM3’s insufficient comprehension of these specific target types. To address this issue, we propose TEP: Advancing Complex Video Object Segmentation via Tracking-Enhanced Prompts. |
JINRONG ZHANG et. al. | arxiv-cs.CV | 2026-03-31 |
| 130 | Exponentially-Krill Herd Algorithm-based Hybrid Deep Architecture for Semantic Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
S. Kanimozhi; Selvamani K.; Amol Dattatray Dhaygude; Vamsidhar Talasila; | Statistics | 2026-03-31 |
| 131 | TreeGaussian: Tree-Guided Cascaded Contrastive Learning for Hierarchical Consistent 3D Gaussian Scene Segmentation and Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, dense pairwise comparisons and inconsistent hierarchical labels from 2D priors hinder feature learning, resulting in suboptimal segmentation. To address these limitations, we introduce TreeGaussian, a tree-guided cascaded contrastive learning framework that explicitly models hierarchical semantic relationships and reduces redundancy in contrastive supervision. |
JINGBIN YOU et. al. | arxiv-cs.CV | 2026-03-31 |
| 132 | BGSC-Net: Boundary-guided Semantic Compensation Network for Remote Sensing Image Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Additionally, during the decoding stage, the lack of explicit boundary guidance frequently causes the loss of edge information during feature reconstruction, compromising the delineation of object contours in intricate environments. To address these issues, we propose a novel hybrid architecture named Boundary-Guided Semantic Compensation Network (BGSC-Net). |
XIN WANG et. al. | PLOS One | 2026-03-31 |
| 133 | Semantic Segmentation Network for Motorway Point Clouds Considering Reflectance and Density Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The MLS point cloud’s semantic segmentation relies primarily on manual, repetitive labour and time-consuming work. To address the above problems, we propose a deep learning network guided by an unsupervised scene simplification module for the semantic segmentation task of motorway MLS point cloud data. |
Peng Cheng; Zhongliang Cai; Wencai Si; Bozhao Lee; | Engineering Research Express | 2026-03-31 |
| 134 | Unified Restoration-Perception Learning: Maritime Infrared-Visible Image Fusion and Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The framework includes a Frequency-Spatial Enhancement Complementary (FSEC) module for degradation suppression and structural enhancement, a Semantic-Visual Consistency Attention (SVCA) module for semantic-consistent guidance, and a cross-modality guided attention mechanism for selective fusion. |
WEICHAO CAI et. al. | arxiv-cs.CV | 2026-03-30 |
| 135 | Camera-Based Semantic Image Segmentation for Autonomous Vehicle Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This review offers insightful analysis and useful suggestions for maximising and utilising SAM2 in practical situations. |
Aparna Tiwari; Shubham Bilgi; Anuja Chaudhari; | International Journal of Science, Strategic Management and … | 2026-03-30 |
| 136 | SegRGB-X: General RGB-X Semantic Segmentation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Semantic segmentation across arbitrary sensor modalities faces significant challenges due to diverse sensor characteristics, and the traditional configurations for this task result in redundant development efforts. We address these challenges by introducing a universal arbitrary-modal semantic segmentation framework that unifies segmentation across multiple modalities. |
JIONG LIU et. al. | arxiv-cs.CV | 2026-03-30 |
| 137 | Cloud–Edge Collaborative Object Detection and Semantic Segmentation in Intelligent Transportation Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Ting Cao; Wenbin Li; Haoxuan Xu; Yuhang Wang; Penghui Wang; | Journal of Grid Computing | 2026-03-30 |
| 138 | Detection of Adversarial Attacks in Robotic Perception Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Deep Neural Networks (DNNs) achieve strong performance in semantic segmentation for robotic perception but remain vulnerable to adversarial attacks, threatening safety-critical … |
Ziad Sharawy; Mohammad Nakshbandi; Sorin Mihai Grigorescu; | arxiv-cs.CV | 2026-03-30 |
| 139 | Progressive Prompt-Guided Cross-Modal Reasoning for Referring Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods either rely on cross-modal alignment or employ Semantic Segmentation Prompts, but they often lack explicit reasoning mechanisms for grounding language descriptions to target regions in the image. To address these limitations, we propose PPCR, a Progressive Prompt-guided Cross-modal Reasoning framework for referring image segmentation. |
JIACHEN LI et. al. | arxiv-cs.CV | 2026-03-29 |
| 140 | Automated Landscape Element Recognition and Layout Optimization Based on Image Segmentation and Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Hui Zhang; Nana Tang; Jiehua Sun; | Scientific Reports | 2026-03-27 |
| 141 | A Novel Deep Learning-based Object Detection Along Semantic Segmentation on Aerial Imagery Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These findings validate the framework’s superior performance in UAV image analysis. In conclusion, the proposed multimodal approach provides a robust solution for aerial object recognition, and we recommend exploring lightweight model variants and self-supervised learning to further enhance its deployment potential. |
MAHA ABDELHAQ et. al. | PeerJ Computer Science | 2026-03-24 |
| 142 | LM-UNet: Lightweight Mamba-UNet Prostate MRI Image Segmentation Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the traditional UNet segmentation network has low segmentation accuracy because of the fuzzy boundary and low contrast. Therefore, we propose a Lightweight Mamba-UNet (LM-UNet) prostate MRI image segmentation method. |
KUNCAI XU et. al. | PLOS One | 2026-03-23 |
| 143 | Look, Listen and Segment: Towards Weakly Supervised Audio-visual Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Weakly Supervised Audio-Visual Semantic Segmentation (WSAVSS), which uses only video-level labels to generate per-frame semantic masks of sounding objects. |
Chengzhi Li; Heyan Huang; Ping Jian; Yanghao Zhou; | arxiv-cs.MM | 2026-03-23 |
| 144 | Ordinal Semantic Segmentation Applied to Medical and Odontological Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, loss functions that incorporate ordinal relationships into deep neural networks are investigated to promote greater semantic consistency in semantic segmentation tasks. |
Mariana Dória Prata Lima; Gilson Antonio Giraldi; Jaime S. Cardoso; | arxiv-cs.CV | 2026-03-21 |
| 145 | MagicSeg: Open-World Segmentation Pretraining Via Counterfactural Diffusion-Based Auto-Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In light of the formidable image generation capabilities of diffusion models, we introduce a novel diffusion model-driven pipeline for automatically generating datasets tailored to the needs of open-world semantic segmentation, named MagicSeg. |
KAIXIN CAI et. al. | arxiv-cs.CV | 2026-03-19 |
| 146 | DriveTok: 3D Driving Scene Tokenization for Unified Multi-View Reconstruction and Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most existing tokenizers are designed for monocular and 2D scenes, leading to inefficiency and inter-view inconsistency when applied to high-resolution multi-view driving scenes. To address this, we propose DriveTok, an efficient 3D driving scene tokenizer for unified multi-view reconstruction and understanding. |
DONG ZHUO et. al. | arxiv-cs.CV | 2026-03-19 |
| 147 | SegFly: A 2D-3D-2D Paradigm for Aerial RGB-Thermal Semantic Segmentation at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Semantic segmentation for uncrewed aerial vehicles (UAVs) is fundamental for aerial scene understanding, yet existing RGB and RGB-T datasets remain limited in scale, diversity, and annotation efficiency due to the high cost of manual labeling and the difficulties of accurate RGB-T alignment on off-the-shelf UAVs. To address these challenges, we propose a scalable geometry-driven 2D-3D-2D paradigm that leverages multi-view redundancy in high-overlap aerial imagery to automatically propagate labels from a small subset of manually annotated RGB images to both RGB and thermal modalities within a unified framework. |
MARKUS GROSS et. al. | arxiv-cs.CV | 2026-03-18 |
| 148 | R&D: Balancing Reliability and Diversity in Synthetic Data Augmentation for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel synthetic data augmentation pipeline that integrates controllable diffusion models. |
Huy Che; Dinh-Duy Phan; Duc-Khai Lam; | arxiv-cs.CV | 2026-03-18 |
| 149 | Poisoning The Pixels: Revisiting Backdoor Attacks on Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we revisit the threats by systematically examining backdoor attacks tailored to semantic segmentation. |
GUANGSHENG ZHANG et. al. | arxiv-cs.CR | 2026-03-17 |
| 150 | DesertFormer: Transformer-Based Semantic Segmentation for Off-Road Desert Terrain Classification in Autonomous Navigation Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present DesertFormer, a semantic segmentation pipeline for off-road desert terrain analysis based on SegFormer B2 with a hierarchical Mix Transformer (MiT-B2) backbone. |
Yasaswini Chebolu; | arxiv-cs.CV | 2026-03-17 |
| 151 | A Few-shot Learning for Image Semantic Segmentation with Weak Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Josh Jia-Ching Ying; Jin-Qun Liao; Ji Zhang; | Engineering Applications of Artificial Intelligence | 2026-03-16 |
| 152 | DCP-CLIP: A Coarse-to-Fine Framework for Open-Vocabulary Semantic Segmentation with Dual Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent years have witnessed remarkable progress in open-vocabulary semantic segmentation (OVSS) using visual-language foundation models, yet existing methods still suffer from the following fundamental challenges: (1) insufficient cross-modal communication between textual and visual spaces, and (2) significant computational costs from interactions with a massive number of categories. To address these issues, this paper describes a novel coarse-to-fine framework, called DCP-CLIP, for OVSS. |
JING WANG et. al. | arxiv-cs.CV | 2026-03-14 |
| 153 | Spatio-Semantic Expert Routing Architecture with Mixture-of-Experts for Referring Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Because of this mismatch, predictions often contain fragmented regions, inaccurate boundaries, or even the wrong object, especially when pretrained backbones are frozen for computational efficiency. To address these limitations, we propose SERA, a Spatio-Semantic Expert Routing Architecture for referring image segmentation. |
Alaa Dalaq; Muzammil Behzad; | arxiv-cs.CV | 2026-03-12 |
| 154 | An Edge-guided Dictionary-learning Segmentation Network for Polymetallic Nodules Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In addition, the large variation in nodule scale makes small nodules difficult to accurately identify and classify. To address these issues, we propose an edge-guided dictionary-learning segmentation network for polymetallic nodules, enabling category-aware precise segmentation. |
MENG ZHAO et. al. | Measurement Science and Technology | 2026-03-11 |
| 155 | From Semantics to Pixels: Coarse-to-Fine Masked Autoencoders for Hierarchical Visual Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Self-supervised visual pre-training methods face an inherent tension: contrastive learning (CL) captures global semantics but loses fine-grained detail, while masked image modeling (MIM) preserves local textures but suffers from attention drift due to semantically-agnostic random masking. We propose C2FMAE, a coarse-to-fine masked autoencoder that resolves this tension by explicitly learning hierarchical visual representations across three data granularities: semantic masks (scene-level), instance masks (object-level), and RGB images (pixel-level). |
WENZHAO XIANG et. al. | arxiv-cs.CV | 2026-03-10 |
| 156 | Efficient RGB-D Scene Understanding Via Multi-task Adaptive Learning and Cross-dimensional Feature Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditional approaches often face challenges, including occlusions, ambiguous boundaries, and the inability to adapt attention based on task-specific requirements and sample variations. To address these limitations, this paper presents an efficient RGB-D scene understanding model that performs a range of tasks, including semantic segmentation, instance segmentation, orientation estimation, panoptic segmentation, and scene classification. |
Guodong Sun; Junjie Liu; Gaoyang Zhang; Bo Wu; Yang Zhang; | arxiv-cs.CV | 2026-03-08 |
| 157 | Visualizing Coalition Formation: From Hedonic Games to Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose image segmentation as a visual diagnostic testbed for coalition formation in hedonic games. |
Pedro Henrique de Paula França; Lucas Lopes Felipe; Daniel Sadoc Menasché; | arxiv-cs.AI | 2026-03-08 |
| 158 | JOPP-3D: Joint Open Vocabulary Semantic Segmentation on Point Clouds and Panoramas Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present JOPP-3D, an open-vocabulary semantic segmentation framework that jointly leverages panoramic and point cloud data to enable language-driven scene understanding. |
SANDEEP INUGANTI et. al. | arxiv-cs.CV | 2026-03-06 |
| 159 | SEFF-Net: A Hybrid Feature Fusion Network for Accurate Segmentation of Breast Ultrasound Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, accurate lesion segmentation remains challenging because of severe speckle noise, low contrast, and blurred tumor boundaries. To address these issues, this paper proposes SEFF-Net, a novel edge-aware feature fusion network with a U-shaped encoder–decoder architecture to capture multi-level semantic representations for the breast ultrasound image segmentation task. |
Tingli Su; Rui Wan; Senmao Wang; Yuting Bai; | ICCK Transactions on Emerging Topics in Artificial … | 2026-03-06 |
| 160 | Local-Contextual Feature Fusion Network Based on Nonlinear Spiking Neural Model for Semantic Segmentation of Remote Sensing Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In order to effectively utilize local contextual features, a channel attention-feature fusion module using a novel nonlinear spiking neuron model is designed to assist the decoder in better feature recovery. |
Junhao Du; Hong Peng; Bing Li; Zhicai Liu; | International Journal of Neural Systems | 2026-03-06 |
| 161 | Rewis3d: Reconstruction Improves Weakly-Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Rewis3d, a framework that leverages recent advances in feed-forward 3D reconstruction to significantly improve weakly supervised semantic segmentation on 2D images. |
Jonas Ernst; Wolfgang Boettcher; Lukas Hoyer; Jan Eric Lenssen; Bernt Schiele; | arxiv-cs.CV | 2026-03-06 |
| 162 | Semantic Class Distribution Learning for Debiasing Semi-Supervised Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Such imbalance causes minority structures to be overwhelmed by dominant classes in feature representations, hindering the learning of discriminative features and making reliable segmentation particularly challenging. To address this, we propose the Semantic Class Distribution Learning (SCDL) framework, a plug-and-play module that mitigates supervision and representation biases by learning structured class-conditional feature distributions. |
YINGXUE SU et. al. | arxiv-cs.CV | 2026-03-05 |
| 163 | TGR-T: Truncated-Gaussian-Weighted Reliability for Adaptive Dynamic Thresholding in Weakly Supervised Indoor 3D Point Cloud Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing weakly supervised methods commonly rely on fixed confidence thresholds for pseudo-label selection, which exhibit limited generalization caused by threshold sensitivity, underutilization of informative low-confidence regions, and progressive noise accumulation during self-training. To address these issues, we propose TGR-T, a weakly supervised framework for indoor 3D point cloud semantic segmentation that incorporates truncated-Gaussian-weighted reliability with adaptive dynamic thresholding. |
ZIWEI LUO et. al. | ISPRS International Journal of Geo-Information | 2026-03-04 |
| 164 | Downstream Task Inspired Underwater Image Enhancement: A Perception-Aware Study from Dataset Construction to Network Design Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most existing UIE methods mainly focus on enhancing images for human visual perception, frequently failing to reconstruct high-frequency details that are critical for task-specific recognition. To address this issue, we propose a Downstream Task-Inspired Underwater Image Enhancement (DTI-UIE) framework, which leverages a human visual perception model to enhance images effectively for underwater vision tasks. |
Bosen Lin; Feng Gao; Yanwei Yu; Junyu Dong; Qian Du; | arxiv-cs.CV | 2026-03-02 |
| 165 | Benchmarking Semantic Segmentation Models Via Appearance and Geometry Attribute Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we construct an automatic data generation pipeline Gen4Seg to stress-test semantic segmentation models by generating various challenging samples with different attribute changes. |
ZIJIN YIN et. al. | arxiv-cs.CV | 2026-03-02 |
| 166 | MEWS: Semantic Image Segmentation with Multiclass Extreme Weak Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
A. Apostolidis; V. Mygdalis; M. Tzimas; I. Pitas; | Neurocomputing | 2026-03-01 |
| 167 | Multiphase Image Segmentation of Naturally Fractured Media: Benchmarking Deep Learning and Conventional Approaches Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate the segmentation accuracy of a convolutional neural network U-net and compare it with traditional methods of Watershed and multi-Otsu thresholding for multiphase segmentation of grayscale images from a naturally fractured coal sample. |
Behrad Tabrizipour; Saeid Sadeghnejad; Mastaneh Hajipour; Thorsten Schäfer; | InterPore Journal | 2026-03-01 |
| 168 | Semantic Segmentation Performance of Aerial Image Segmentation Using Weighted Ensemble Trained Networks CNNs Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Zahra Faska; Lahbib Khrissi; Khalid Haddouch; Nabil El Akkad; | Multimedia Tools and Applications | 2026-02-26 |
| 169 | MedCLIPSeg: Probabilistic Vision-Language Adaptation for Data-Efficient and Generalizable Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present MedCLIPSeg, a novel framework that adapts CLIP for robust, data-efficient, and uncertainty-aware medical image segmentation. |
TAHA KOLEILAT et. al. | arxiv-cs.CV | 2026-02-23 |
| 170 | WOW-Seg: A Word-free Open World Segmentation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To bridge discrepancies, we propose WOW-Seg, a Word-free Open World Segmentation model for segmenting and recognizing objects from open-set categories. We further construct an open world region recognition test benchmark: the Region Recognition Dataset (RR-7K). |
DANYANG LI et. al. | iclr | 2026-02-17 |
| 171 | Advancing Complex Video Object Segmentation Via Progressive Concept Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Segment Concept (SeC), a concept-driven video object segmentation (VOS) framework that shifts from conventional feature matching to the progressive construction and utilization of high-level, object-centric representations. To rigorously assess VOS methods in scenarios demanding high-level conceptual reasoning and robust semantic understanding, we introduce the Semantic Complex Scenarios Video Object Segmentation benchmark (SeCVOS). |
ZHIXIONG ZHANG et. al. | iclr | 2026-02-17 |
| 172 | Hierarchical Prototype Learning for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Conventional semantic segmentation methods often fail to distinguish fine-grained parts within the same object because of missing links between part-level cues and object-level semantics. Inspired by how humans recognize objects, which involves first identifying them as a whole and then distinguishing their parts, we propose a hierarchical prototype-based segmentation method called Hierarchical Prototype Segmentation (HiPoSeg). |
Seoha Lim; Jinmyeong Kim; Jieun Kim; Sung-Bae Cho; | iclr | 2026-02-17 |
| 173 | Object-Centric Refinement for Enhanced Zero-Shot Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This hinders the performance of the segmentation decoder, especially for unseen categories. To mitigate this issue, we propose object-centric zero-shot segmentation (OC-ZSS) that enhances patch representations using object-level information. |
Srinivasa Rao Nandam; Sara Atito; Zhenhua Feng; Josef Kittler; Muhammad Awais; | iclr | 2026-02-17 |
| 174 | Video Scene Segmentation with Genre and Duration Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel approach that incorporates production-level metadata, specifically genre conventions and shot duration patterns, into video scene segmentation. |
Jungu Cho; Seong Jong Ha; Hae-Gon Jeon; | iclr | 2026-02-17 |
| 175 | UCGM: Enhancing Pseudo Labels Via Uncertainty and Cross-image Gaussian Mixture Model for Semi-supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
ZHENYAN WANG et. al. | Expert Syst. Appl. | |
| 176 | Cross-view Domain Generalization Via Geometric Consistency for LiDAR Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Accordingly, we formulate cross-view domain generalization for LiDAR semantic segmentation and propose a novel framework, termed CVGC (Cross-View Geometric Consistency). |
JINDONG ZHAO et. al. | arxiv-cs.CV | 2026-02-16 |
| 177 | VIPA: Visual Informative Part Attention for Referring Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To more effectively exploit visual contexts for fine-grained segmentation, we propose a novel Visual Informative Part Attention (VIPA) framework for referring image segmentation. |
YUBIN CHO et. al. | arxiv-cs.CV | 2026-02-16 |
| 178 | A Hierarchical Multi-token Transformer for 3D Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
HUI LIN et. al. | International Journal of Machine Learning and Cybernetics | 2026-02-16 |
| 179 | DynaGuide: A Generalizable Dynamic Guidance Framework for Unsupervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods often struggle to reconcile global semantic structure with fine-grained boundary accuracy. This paper introduces DynaGuide, an adaptive segmentation framework that addresses these challenges through a novel dual-guidance strategy and dynamic loss optimization. |
Boujemaa Guermazi; Riadh Ksantini; Naimul Khan; | arxiv-cs.CV | 2026-02-13 |
| 180 | Privacy-Concealing Cooperative Perception for BEV Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel Privacy-Concealing Cooperation (PCC) framework for Bird’s Eye View (BEV) semantic segmentation. |
Song Wang; Lingling Li; Marcus Santos; Guanghui Wang; | arxiv-cs.CV | 2026-02-13 |
| 181 | Enhancing Scene Understanding Using RGB‐D Visuals and Deep Learning Segmentation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Hence, we propose a model for scene comprehension using depth images only in indoor and outdoor settings. |
AYSHA NASEER et. al. | ETRI Journal | 2026-02-12 |
| 182 | DeepMixNet: Deep Multi‐Scale Interactive Feature Mixing Network for Automated Skin Lesion Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose DeepMixNet, a novel deep multi‐scale interactive feature fusion network specifically tailored for automated skin lesion segmentation. |
Ying Wang; XinYu Wang; Meng Zhang; Meiyan Liang; Jian’an Liang; | International Journal of Imaging Systems and Technology | 2026-02-09 |
| 183 | A Multi-Strategy Improved Dung Beetle Optimizer for The Kapur Entropy Multi-Threshold Image Segmentation Algorithm Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the issues of detail loss and unstable segmentation quality in image segmentation, this paper proposes a multi-strategy improved dung beetle optimization algorithm to be applied to multi-threshold image segmentation. |
Jinjin Li; Yecai Guo; Meiyu Liang; Haiyan Long; Tianfei Zhang; | Algorithms | 2026-02-09 |
| 184 | Geospatial-Reasoning-Driven Vocabulary-Agnostic Remote Sensing Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This “appearance-based” paradigm lacks geospatial contextual awareness, leading to severe semantic ambiguity and misclassification when encountering land-cover classes with similar spectral features but distinct semantic attributes. To address this, we propose a Geospatial Reasoning Chain-of-Thought (GR-CoT) framework designed to enhance the scene understanding capabilities of Multimodal Large Language Models (MLLMs), thereby guiding open-vocabulary segmentation models toward precise mapping. |
Chufeng Zhou; Jian Wang; Xinyuan Liu; Xiaokang Zhang; | arxiv-cs.CV | 2026-02-08 |
| 185 | Fully Automated Volumetry of Ventricular Subregions on Computed Tomography Using Object Detection and Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
RAFFAELE DA MUTTEN et. al. | NeuroImage: Reports | 2026-02-06 |
| 186 | Seeing Roads Through Words: A Language-Guided Framework for RGB-T Driving Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: RGB-Thermal fusion is a standard approach, yet existing methods apply static fusion strategies uniformly across all conditions, allowing modality-specific noise to propagate throughout the network. Hence, we propose CLARITY that dynamically adapts its fusion strategy to the detected scene condition. |
Ruturaj Reddy; Hrishav Bakul Barua; Junn Yong Loo; Thanh Thi Nguyen; Ganesh Krishnasamy; | arxiv-cs.CV | 2026-02-06 |
| 187 | LoGoSeg: Integrating Local and Global Features for Open-Vocabulary Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most existing approaches lack strong object priors and region-level constraints, which can lead to object hallucination or missed detections, further degrading performance. To address these challenges, we propose LoGoSeg, an efficient single-stage framework that integrates three key innovations: (i) an object existence prior that dynamically weights relevant categories through global image-text similarity, effectively reducing hallucinations; (ii) a region-aware alignment module that establishes precise region-level visual-textual correspondences; and (iii) a dual-stream fusion mechanism that optimally combines local structural information with global semantic context. |
JUNYANG CHEN et. al. | arxiv-cs.CV | 2026-02-05 |
| 188 | GauComp: 3D Gaussian Completion for Associated Shadow and Object Removal Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a novel pipeline named GauComp, focusing on addressing the task of object removal in 3D Gaussian‐based scenes, with automatic removal of associated shadows. |
Wenxing Zheng; Yuqi Li; Jinghui Xiang; Xiaohao Peng; Chong Wang; | Computer Graphics Forum | 2026-02-04 |
| 189 | Seeing Through Clutter: Structured 3D Scene Reconstruction Via Iterative Object Removal Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present SeeingThroughClutter, a method for reconstructing structured 3D representations from single images by segmenting and modeling objects individually. |
Rio Aguina-Kang; Kevin James Blackburn-Matzen; Thibault Groueix; Vladimir Kim; Matheus Gadelha; | arxiv-cs.CV | 2026-02-03 |
| 190 | CSN: A Compact Semantic Segmentation Network for Visual Scene Perception in Assistive Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yunjia Lei; S. L. Phung; Yang Di; A. Bouzerdoum; | Comput. Vis. Image Underst. | 2026-02-01 |
| 191 | PointNeXt-DBSCAN: A Hybrid Point Cloud Deep Learning Framework for Multi-stage Cotton Leaf Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study addresses the challenge of organ-level instance segmentation in cotton point clouds, which arises from significant morphological variations and leaf occlusion across growth stages. |
Zeyu Lei; Debin Zeng; Liangfang Zheng; | Frontiers in Plant Science | 2026-01-29 |
| 192 | SHED Light on Segmentation for Dense Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose SHED, a novel encoder-decoder architecture that enforces geometric prior explicitly by incorporating segmentation into dense prediction. |
Seung Hyun Lee; Sangwoo Mo; Stella X. Yu; | arxiv-cs.CV | 2026-01-29 |
| 193 | Bidirectional Cross-Perception for Open-Vocabulary Semantic Segmentation in Remote Sensing Imagery Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing training-free open-vocabulary semantic segmentation (OVSS) methods typically fuse CLIP and vision foundation models (VFMs) using one-way injection and shallow post-processing strategies, making it difficult to satisfy these requirements. To address this issue, we propose a spatial-regularization-aware dual-branch collaborative inference framework for training-free OVSS, termed SDCI. |
Jianzheng Wang; Huan Ni; | arxiv-cs.CV | 2026-01-28 |
| 194 | Multi-scale Scene Graph Generation for Remote Sensing Imagery Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The map, as a way of representing geospatial data, is designed to reflect important information about the Earth as deeply and accurately as possible. To meet this … |
Vladimir A. Knyaz; Vladimir V. Kniaz; Anton V. Emelyanov; Sergey Yu. Zheltov; Victor S. Aleksandrov; | The International Archives of the Photogrammetry, Remote … | 2026-01-27 |
| 195 | UCAD: Uncertainty-guided Contour-aware Displacement for Semi-supervised Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing displacement strategies in semi-supervised segmentation only operate on rectangular regions, ignoring anatomical structures and resulting in boundary distortions and semantic inconsistency. To address these issues, we propose UCAD, an Uncertainty-Guided Contour-Aware Displacement framework for semi-supervised medical image segmentation that preserves contour-aware semantics while enhancing consistency learning. |
Chengbo Ding; Fenghe Tang; Shaohua Kevin Zhou; | arxiv-cs.CV | 2026-01-24 |
| 196 | VISTA-PATH: An Interactive Foundation Model for Pathology Image Segmentation and Quantitative Analysis in Computational Pathology Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here we present VISTA-PATH, an interactive, class-aware pathology segmentation foundation model designed to resolve heterogeneous structures, incorporate expert feedback, and produce pixel-level segmentations that are directly meaningful for clinical interpretation. |
PEIXIAN LIANG et. al. | arxiv-cs.CV | 2026-01-23 |
| 197 | Reliable Brain Tumor Segmentation Based on Spiking Neural Networks with Efficient Training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a reliable and energy-efficient framework for 3D brain tumor segmentation using spiking neural networks (SNNs). |
Aurora Pia Ghiardelli; Guangzhi Tang; Tao Sun; | arxiv-cs.CV | 2026-01-23 |
| 198 | TransEdgeNet: A Multi-Branch Edge-Aware CNN-Transformer Network for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the aforementioned issues, this paper proposes a novel parallel architecture called TransEdgeNet, which integrates CNN and Swin Transformers to fully leverage the complementarity of multimodal features and enhance edge information representation. |
Yanfei Peng; Hongyou Guo; Zuyi Chen; | Engineering Research Express | 2026-01-21 |
| 199 | Context Patch Fusion With Class Token Enhancement for Weakly Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, they often neglect the complex contextual dependencies among image patches, resulting in incomplete local representations and limited segmentation accuracy. To address these issues, we propose the Context Patch Fusion with Class Token Enhancement (CPF-CTE) framework, which exploits contextual relations among patches to enrich feature representations and improve segmentation. |
Yiyang Fu; Hui Li; Wangyu Wu; | arxiv-cs.CV | 2026-01-21 |
| 200 | SSR-SAM: Retrieval-Style Segment Anything Model for Semi-Supervised Ultra-High-Resolution Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose SSR-SAM, a retrieval-style semi-supervised segmentation framework tailored for UHR images. |
Shijie Li; Yiming Chen; Zhineng Chen; Kai Hu; Xieping Gao; | aaai | 2026-01-20 |
| 201 | DBGroup: Dual-Branch Point Grouping for Weakly Supervised 3D Semantic Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these approaches still encounter limitations, including labor-intensive annotation processes, high complexity, and reliance on expert annotators. To address these challenges, we propose DBGroup, a two-stage weakly supervised 3D instance segmentation framework that leverages scene-level annotations as a more efficient and scalable alternative. |
Xuexun Liu; Xiaoxu Xu; Qiudan Zhang; Lin Ma; Xu Wang; | aaai | 2026-01-20 |
| 202 | InfoCLIP: Bridging Vision-Language Pretraining and Open-Vocabulary Semantic Segmentation Via Information-Theoretic Alignment Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To stabilize modality alignment during fine-tuning, we propose InfoCLIP, which leverages an information-theoretic perspective to transfer alignment knowledge from pretrained CLIP to the segmentation task. |
MUYAO YUAN et. al. | aaai | 2026-01-20 |
| 203 | Toward Real-World High-Precision Image Matting and Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a Foreground Consistent Learning model, dubbed as FCLM, to address the aforementioned issues. |
HAIPENG ZHOU et. al. | aaai | 2026-01-20 |
| 204 | Diffusion-Based Contextual Reconstruction for Point Cloud Segmentation with Limited Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a diffusion-based contextual reconstruction framework for point cloud semantic segmentation with limited annotations. |
JIAWEI LIAN et. al. | aaai | 2026-01-20 |
| 205 | JoDiffusion: Jointly Diffusing Image with Pixel-Level Annotations for Semantic Segmentation Promotion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To mitigate both problems with one stone, we present a novel dataset generative diffusion framework for semantic segmentation, termed JoDiffusion. |
HAOYU WANG et. al. | aaai | 2026-01-20 |
| 206 | Symmetrical Flow Matching: Unified Image Generation, Segmentation, and Classification with Score-Based Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work introduces Symmetrical Flow Matching (SymmFlow), a new formulation that unifies semantic segmentation, classification, and image generation within a single model. |
Francisco Caetano; Christiaan Viviers; Peter H.N. de With; Fons van der Sommen; | aaai | 2026-01-20 |
| 207 | Learning 3D Texture-Aware Representations for Parsing Diverse Human Clothing and Body Parts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recent open-vocabulary segmentation approaches leverage pretrained text-to-image (T2I) diffusion model features for strong zero-shot transfer, but typically group entire humans into a single person category, failing to distinguish diverse clothing or detailed body parts. To address this, we propose Spectrum, a unified network for part-level pixel parsing (body parts and clothing) and instance-level grouping. |
Kiran Chhatre; Christopher E. Peters; Srikrishna Karanam; | aaai | 2026-01-20 |
| 208 | SAQ-SAM: Semantically-Aligned Quantization for Segment Anything Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: (ii) Existing quantization reconstruction methods neglect semantic interactivity of SAM, leading to misalignment between image feature and prompt intention. To address the above issues, we propose SAQ-SAM in this paper, which boosts PTQ for SAM from the perspective of semantic alignment. |
Jing Zhang; Zhikai Li; Chengzhi Hu; Xuewen Liu; Qingyi Gu; | aaai | 2026-01-20 |
| 209 | MR-COSMO: Visual-Text Memory Recall and Direct CrOSs-MOdal Alignment Method for Query-Driven 3D Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The rapid advancement of vision-language models (VLMs) in 3D domains has accelerated research in text-query-guided point cloud processing, though existing methods underperform in point-level segmentation due to inadequate 3D-text alignment that limits local feature-text context linking. To address this limitation, we propose MR-COSMO, a Visual-Text Memory Recall and Direct CrOSs-MOdal Alignment Method for Query-Driven 3D Segmentation, establishing explicit alignment between 3D point clouds and text/2D image data through a dedicated direct cross-modal alignment module while implementing a visual-text memory module with specialized feature banks. |
Chade Li; Pengju Zhang; Yihong Wu; | aaai | 2026-01-20 |
| 210 | VideoSeg-R1: Reasoning Video Object Segmentation Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditional video reasoning segmentation methods rely on supervised fine-tuning, which limits generalization to out-of-distribution scenarios and lacks explicit reasoning. To address this, we propose VideoSeg-R1, the first framework to introduce reinforcement learning into video reasoning segmentation. |
ZISHAN XU et. al. | aaai | 2026-01-20 |
| 211 | UniC-Lift: Unified 3D Instance Segmentation Via Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods use a two-stage approach in which some rely on contrastive learning with hyperparameter-sensitive clustering, while others preprocess labels for consistency. We propose a unified framework that merges these steps, reducing training time and improving performance by introducing a learnable feature embedding for segmentation in Gaussian primitives. |
Ankit Dhiman; Srinath R; Jaswanth Reddy; Lokesh R Boregowda; Venkatesh Babu Radhakrishnan; | aaai | 2026-01-20 |
| 212 | GaussianTrimmer: Online Trimming Boundaries for 3DGS Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose an online boundary trimming method, GaussianTrimmer, which is an efficient and plug-and-play post-processing method capable of trimming coarse boundaries for existing 3D Gaussian segmentation methods. |
Liwei Liao; Ronggang Wang; | arxiv-cs.CV | 2026-01-18 |
| 213 | Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here we present Medical SAM3, a foundation model for universal prompt-driven medical image segmentation, obtained by fully fine-tuning SAM3 on large-scale, heterogeneous 2D and 3D medical imaging datasets with paired segmentation masks and text prompts. |
CHONGCONG JIANG et. al. | arxiv-cs.CV | 2026-01-15 |
| 214 | Jordan-Segmentable Masks: A Topology-Aware Definition for Characterizing Binary Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a topology-aware notion of segmentation based on the Jordan Curve Theorem, and adapted for use in digital planes. |
Serena Grazia De Benedictis; Amedeo Altavilla; Nicoletta Del Buono; | arxiv-cs.CV | 2026-01-15 |
| 215 | SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a simple but effective framework, termed SAM2-UNet, for versatile image segmentation. |
XINYU XIONG et. al. | Visual Intelligence | 2026-01-13 |
| 216 | How Do Optical Flow and Textual Prompts Collaborate to Assist in Audio-Visual Semantic Segmentation? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a novel collaborative framework, Stepping Stone Plus (SSP), which integrates optical flow and textual prompts to assist the segmentation process. |
Peng Gao; Yujian Lee; Yongqi Xu; Wentao Fan; | arxiv-cs.CV | 2026-01-12 |
| 217 | HG-RSOVSSeg: Hierarchical Guidance Open-Vocabulary Semantic Segmentation Framework of High-Resolution Remote Sensing Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we propose a multimodal feature aggregation module for pixel-level alignment and a hierarchical visual feature decoder guided by text feature alignment, which progressively refines visual features using language priors, preserving semantic coherence during high-resolution decoding. |
Wubiao Huang; Fei Deng; Huchen Li; Jing Yang; | Remote Sensing | 2026-01-09 |
| 218 | G2P: Gaussian-to-Point Attribute Alignment for Boundary-Aware 3D Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Gaussian-to-Point (G2P), which transfers appearance-aware attributes from 3D Gaussian Splatting to point clouds for more discriminative and appearance-consistent segmentation. |
HOJUN SONG et. al. | arxiv-cs.CV | 2026-01-06 |
| 219 | EarthVL: A Progressive Earth Vision-Language Understanding and Generation Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Three benchmarks, including semantic segmentation, multiple-choice, and open-ended VQA, demonstrated the superiority of EarthVLNet, yielding three future directions: 1) segmentation features consistently enhance VQA performance even in cross-dataset scenarios; 2) multiple-choice tasks show greater sensitivity to the vision encoder than to the language decoder; and 3) open-ended tasks necessitate advanced vision encoders and language decoders for optimal performance. We believe this dataset and method will provide a beneficial benchmark that connects “image-mask-text”, advancing geographical applications for Earth vision. |
JUNJUE WANG et. al. | arxiv-cs.CV | 2026-01-06 |
| 220 | Joint Semantic and Rendering Enhancements in 3D Gaussian Modeling with Anisotropic Local Encoding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a joint enhancement framework for 3D semantic Gaussian modeling that synergizes both semantic and rendering branches. |
Jingming He; Chongyi Li; Shiqi Wang; Sam Kwong; | arxiv-cs.CV | 2026-01-05 |
| 221 | Leveraging 2D-VLM for Label-Free 3D Segmentation in Large-Scale Outdoor Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a novel 3D semantic segmentation method for large-scale point cloud data that does not require annotated 3D training data or paired RGB images. |
Toshihiko Nishimura; Hirofumi Abe; Kazuhiko Murasaki; Taiga Yoshida; Ryuichi Tanida; | arxiv-cs.CV | 2026-01-05 |
| 222 | USformer: A U-Shaped Structure Transformer for RGB-Thermal Semantic Segmentation and Traffic Scene Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recent advancements in multimodal approaches, particularly RGB-thermal (RGB-T) segmentation, have significantly promoted the development of Intelligent Transportation Systems … |
Fan Yang; Feng Shao; Baoyang Mu; Xiongli Chai; Qiuping Jiang; | IEEE Transactions on Intelligent Transportation Systems | 2026-01-01 |
| 223 | Hybrid Transfer Semantic Segmentation Architecture for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Huaiping Yan; Yupeng Hou; Chengcai Leng; Yilin Li; Yang Li; | Digit. Signal Process. | 2026-01-01 |
| 224 | Extreme Weakly Supervised Binary Semantic Image Segmentation Via One-pixel Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
M. Tzimas; V. Mygdalis; Christos Papaioannidis; Ioannis Pitas; | Pattern Recognit. | 2026-01-01 |
| 225 | 3D Semantic Segmentation for Post-Disaster Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While 3D semantic segmentation is crucial for post-disaster assessment, existing deep learning models lack datasets specifically designed for post-disaster environments. To address this gap, we constructed a specialized 3D dataset from aerial footage of Hurricane Ian (2022) captured by unmanned aerial vehicles (UAVs) over affected areas, employing Structure-from-Motion (SfM) and Multi-View Stereo (MVS) techniques to reconstruct 3D point clouds. |
Nhut Le; Maryam Rahnemoonfar; | arxiv-cs.CV | 2025-12-30 |
| 226 | BATISNet: Instance Segmentation of Tooth Point Clouds with Boundary Awareness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, due to the tightly packed structure of teeth, unclear boundaries, and the diversity of complex cases such as missing and malposed teeth, semantic segmentation often struggles to achieve satisfactory results when dealing with complex dental cases. To address these issues, this paper proposes BATISNet, a boundary-aware instance network for tooth point cloud segmentation. |
Yating Cai; Yanghui Xu; Zehua Hu; Jiazhou Chen; Jing Huang; | arxiv-cs.GR | 2025-12-30 |
| 227 | PASENet: Snowy Scene 3D Object Detection With Pillar‐Wise Attention and Semantic Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: LiDAR-based 3D object detection plays an essential role in autonomous driving, yet snowy conditions degrade point cloud quality by introducing false returns and causing occlusion of objects, degrading the detection accuracy of existing 3D object detection algorithms. To overcome these challenges, we propose pillar‐wise attention and semantic enhancement network (PASENet), an end‐to‐end network specifically designed for snowy scenes. |
Yutian Wu; Wenwei Sun; Zuodong Zhong; Qing Li; | IET Image Processing | 2025-12-30 |
| 228 | SwinTF3D: A Lightweight Multimodal Fusion Approach for Text-Guided 3D Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The lack of semantic understanding in these models makes them ineffective in addressing flexible, user-defined segmentation objectives. To overcome these limitations, we propose SwinTF3D, a lightweight multimodal fusion approach that unifies visual and linguistic representations for text-guided 3D medical image segmentation. |
Hasan Faraz Khan; Noor Fatima; Muzammil Behzad; | arxiv-cs.CV | 2025-12-28 |
| 229 | Scene-VLM: Multimodal Video Scene Segmentation Via Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present Scene-VLM, the first fine-tuned vision-language model (VLM) framework for video scene segmentation. |
NIMROD BERMAN et. al. | arxiv-cs.CV | 2025-12-25 |
| 230 | Surgical Scene Segmentation Using A Spike-Driven Video Transformer with Real-Time Potential Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose SpikeSurgSeg, the first spike-driven video Transformer framework tailored for surgical scene segmentation with real-time potential on non-GPU platforms. |
SHIHAO ZOU et. al. | arxiv-cs.CV | 2025-12-24 |
| 231 | A Dual-Path Fusion Network with Edge Feature Enhancement for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a Dual-path Feature-enhanced Fusion Network (DPF-Net) for medical image segmentation to address limitations in existing methods, including insufficient edge feature extraction, semantic gaps among multi-scale encoder features, and significant semantic disparities between the encoder and decoder in U-Net architectures. |
Liangxu Shi; Weiyuan He; Guodong Wang; | Mathematics | 2025-12-24 |
| 232 | PGMNet: A Polyp Segmentation Network Based on Bit-Plane Slicing and Multi-Scale Adaptive Fusion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Accurate detection and segmentation of polyps during colonoscopy are of great significance for the early prevention and treatment of colorectal cancer. However, due to the … |
Dong Wang; ShanLin Liu; Shuai Li; HaiSha Liu; YuLingHeng Wang; | Biomedical Physics & Engineering Express | 2025-12-22 |
| 233 | VOIC: Visible-Occluded Decoupling for Monocular 3D Semantic Scene Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This strategy purifies the supervisory space for two complementary sub-tasks: visible-region perception and occluded-region reasoning. Building on this idea, we propose the Visible-Occluded Interactive Completion Network (VOIC), a novel dual-decoder framework that explicitly decouples SSC into visible-region semantic perception and occluded-region scene completion. |
Zaidao Han; Risa Higashita; Jiang Liu; | arxiv-cs.CV | 2025-12-21 |
| 234 | IndiVNet A Region Adaptive Semantic Image Segmentation for Autonomous Driving in Unstructured Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Pritam Chakraborty; Anjan Bandyopadhyay; Siddhartha Bhattacharyya; Jan Platos; | Scientific Reports | 2025-12-20 |
| 235 | SynthSeg-Agents: Multi-Agent Synthetic Data Generation for Zero-Shot Weakly Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a novel direction, Zero-Shot Weakly Supervised Semantic Segmentation (ZSWSSS), and propose SynthSeg-Agents, a multi-agent framework driven by Large Language Models (LLMs) to generate synthetic training data entirely without real images. |
Wangyu Wu; Zhenhong Chen; Xiaowei Huang; Fei Ma; Jimin Xiao; | arxiv-cs.CV | 2025-12-17 |
| 236 | Unified Semantic Transformer for 3D Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce UNITE, a Unified Semantic Transformer for 3D scene understanding, a novel feed-forward neural network that unifies a diverse set of 3D semantic tasks within a single model. |
SEBASTIAN KOCH et. al. | arxiv-cs.CV | 2025-12-16 |
| 237 | Dual-Branch Superpixel and Class-Center Attention Network for Efficient Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we introduce a superpixel sampling weighting module that models pixel dependencies based on different regional affiliations, thereby enhancing the network’s sensitivity to object boundaries while preserving local features. |
YUNTING ZHANG et. al. | Sensors | 2025-12-16 |
| 238 | Pancakes: Consistent Multi-Protocol Image Segmentation Across Biomedical Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Pancakes, a framework that, given a new image from a previously unseen domain, automatically generates multi-label segmentation maps for multiple plausible protocols, while maintaining semantic consistency across related images. |
Marianne Rakic; Siyu Gai; Etienne Chollet; John V. Guttag; Adrian V. Dalca; | arxiv-cs.CV | 2025-12-15 |
| 239 | Semantic Segmentation of Remote Sensing Images Via Visible-Infrared Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we study a deep-learning-based unsupervised domain adaptation method for semantic segmentation of remote sensing images. |
Yuan Chang; Bin Hui; Qifu Zhang; | International Journal of Pattern Recognition and Artificial … | 2025-12-10 |
| 240 | Modulatory Feedback Determines Attentional Object Segmentation in A Model of The Ventral Stream Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The present study presents a biologically plausible neural network that performs scene segmentation and can shift attention using modulatory feedback connections from higher to lower cortical brain areas. |
Paolo Papale; Jonathan R. Williford; Stijn Balk; Pieter R. Roelfsema; | PLOS One | 2025-12-10 |
| 241 | SegEarth-OV3: Exploring SAM 3 for Open-Vocabulary Semantic Segmentation in Remote Sensing Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a preliminary exploration of applying SAM 3 to the remote sensing OVSS task without any training. |
KAIYU LI et. al. | arxiv-cs.CV | 2025-12-09 |
| 242 | Human Detection in UAV Thermal Imagery: Dataset Extension and Comparative Evaluation on Embedded Platforms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing datasets are mostly limited to urban or open-field scenarios, and our experiments show that models trained on such heterogeneous data achieve poor results. To address this gap, we collected and annotated thermal images in mountainous environments using a DJI M3T drone under clear daytime conditions. |
Andrei-Alexandru Ulmămei; Taddeo D’Adamo; Costin-Emanuel Vasile; Radu Hobincu; | Journal of Imaging | 2025-12-09 |
| 243 | Dynamic Mutual Adversarial Learning for Semi-Supervised Semantic Segmentation of Underwater Images with Limited and Noisy Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we delineate the formulation of a novel semi-supervised paradigm with dynamic mutual adversarial training for the semantic segmentation of underwater images. |
HAN CHEN et. al. | Journal of Marine Science and Engineering | 2025-12-08 |
| 244 | Generalized Referring Expression Segmentation on Aerial Photos Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work presents Aerial-D, a new large-scale referring expression segmentation dataset for aerial imagery, comprising 37,288 images with 1,522,523 referring expressions that cover 259,709 annotated targets, spanning across individual object instances, groups of instances, and semantic regions covering 21 distinct classes that range from vehicles and infrastructure to land coverage types. |
Luís Marnoto; Alexandre Bernardino; Bruno Martins; | arxiv-cs.CV | 2025-12-08 |
| 245 | Selective Masking Based Self-Supervised Learning for Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a novel self-supervised learning method for semantic segmentation using selective masking image reconstruction as the pretraining task. |
Yuemin Wang; Ian Stavness; | arxiv-cs.CV | 2025-12-07 |
| 246 | Power of Boundary and Reflection: Semantic Transparent Object Segmentation Using Pyramid Vision Transformer with Transparent Cues Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While it is known that human perception relies on boundary and reflective-object features to distinguish glass objects, the existing literature has not yet sufficiently captured both properties when handling transparent objects. Hence, we propose incorporating both of these powerful visual cues via the Boundary Feature Enhancement and Reflection Feature Enhancement modules in a mutually beneficial way. |
TUAN-ANH VU et. al. | arxiv-cs.CV | 2025-12-07 |
| 247 | See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose depth-guided surgical scene segmentation (DepSeg), a training-free framework that utilizes monocular depth as a geometric prior together with pretrained vision foundation models. |
Kunyi Yang; Qingyu Wang; Cheng Yuan; Yutong Ban; | arxiv-cs.CV | 2025-12-05 |
| 248 | The SAM2-to-SAM3 Gap in The Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper investigates the fundamental discontinuity between the latest two Segment Anything Models: SAM2 and SAM3. |
Ranjan Sapkota; Konstantinos I. Roumeliotis; Manoj Karkee; | arxiv-cs.CV | 2025-12-04 |
| 249 | Deep Learning Approach for Crop-weed Segmentation in Peanut Cultivation Using PSPEdgeWeedNet Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this context, PSPEdgeWeedNet is proposed, a novel edge-aware deep learning architecture tailored for precise semantic segmentation of crops and weeds within peanut cultivation fields. Distinct from the conventional Pyramid Scene Parsing Network (PSPNet) and its boundary-aware variant developed as a baseline in this research, PSPEdgeWeedNet introduces a dedicated edge detection branch. |
Deepthi G Pai; Mamatha Balachandra; Radhika Kamath; | Scientific Reports | 2025-12-03 |
| 250 | Distilling Hierarchical Knowledge From Multimodal Fusion for Unimodal Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The application of multimodal image fusion has become increasingly widespread across various fields in the era of deep learning. Existing fusion methods integrate infrared and … |
YUJIA SUN et. al. | IEEE Transactions on Circuits and Systems for Video … | 2025-12-01 |
| 251 | UDG-Prom: A Unified Dense-guided Semantic Prompting for Cross-domain Few-shot Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
JIAQI YANG et. al. | Knowl. Based Syst. | 2025-12-01 |
| 252 | YGDD‐SLAM: Direct Geometric Constraint SLAM Based on Object Detection and Depth Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Peng Liao; Liheng Chen; Jialiang Tang; Zhengyong Feng; | Journal of Field Robotics | 2025-12-01 |
| 253 | Evaluating SAM2 for Video Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The Segment Anything Model 2 (SAM2) has proven to be a powerful foundation model for promptable visual object segmentation in both images and videos, capable of storing object-aware memories and transferring them temporally through memory blocks. |
SYED HESHAM SYED ARIFF et. al. | arxiv-cs.CV | 2025-12-01 |
| 254 | Multi-view Consistent Feature Learning for Open-set Semantic Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
HAIXIA WANG et. al. | Expert Syst. Appl. | 2025-12-01 |
| 255 | Semantic-Guided Diffusion Sampling: A Generalized Strategy for Enhancing Object Segmentation Oriented Multimodal Image Fusion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Diffusion models have demonstrated impressive generative capabilities in various computer vision tasks, providing a novel technological approach to the study of multimodal image … |
Yu Shi; Yu Liu; Juan Cheng; Huafeng Li; Xun Chen; | IEEE Journal of Selected Topics in Signal Processing | 2025-12-01 |
| 256 | Reducing Semantic Ambiguity in Open-vocabulary Remote Sensing Image Segmentation Via Knowledge Graph-enhanced Class Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Wubiao Huang; Huchen Li; Shuai Zhang; Fei Deng; | ISPRS Journal of Photogrammetry and Remote Sensing | 2025-11-30 |
| 257 | A Fast and Efficient Modern BERT Based Text-Conditioned Diffusion Model for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose FastTextDiff, a label-efficient diffusion-based segmentation model that integrates medical text annotations to enhance semantic representations. |
Venkata Siddharth Dhara; Pawan Kumar; | arxiv-cs.CV | 2025-11-26 |
| 258 | The Role of U-Net Variants in Semantic Segmentation of Remote Sensing Images: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This survey systematically reviews major U-Net extensions (including U-Net++, ResUNet-a, HCANet, CCT-Net, DIResUNet, CM-UNet, TransUNet, AER-UNet and U-KAN) and additional optimization techniques such as incremental learning. |
Yiyang Liu; | Applied and Computational Engineering | 2025-11-26 |
| 259 | Image Diffusion Models Exhibit Emergent Temporal Propagation in Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we investigate how the self-attention maps of image diffusion models can be reinterpreted as semantic label propagation kernels, providing robust pixel-level correspondences between relevant image regions. |
Youngseo Kim; Dohyun Kim; Geohee Han; Paul Hongsuck Seo; | arxiv-cs.CV | 2025-11-25 |
| 260 | Vision–Language Enhanced Foundation Model for Semi-supervised Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we integrate VLM-based segmentation into semi-supervised medical image segmentation by introducing a Vision-Language Enhanced Semi-supervised Segmentation Assistant (VESSA) that incorporates foundation-level visual-semantic understanding into SSL frameworks. |
JIAQI GUO et. al. | arxiv-cs.CV | 2025-11-24 |
| 261 | DiffSeg30k: A Multi-Turn Diffusion Editing Benchmark for Localized AIGC Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce DiffSeg30k, a publicly available dataset of 30k diffusion-edited images with pixel-level annotations, designed to support fine-grained detection. |
Hai Ci; Ziheng Peng; Pei Yang; Yingxin Xuan; Mike Zheng Shou; | arxiv-cs.CV | 2025-11-24 |
| 262 | SAM3-Adapter: Efficient Adaptation of Segment Anything 3 for Camouflage Object Segmentation, Shadow Detection, and Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With the emergence of Segment Anything 3 (SAM3)-a more efficient and higher-performing evolution with a redesigned architecture and improved training pipeline-we revisit these long-standing challenges. In this work, we present SAM3-Adapter, the first adapter framework tailored for SAM3 that unlocks its full segmentation capability. |
TIANRUN CHEN et. al. | arxiv-cs.CV | 2025-11-24 |
| 263 | Vision-Language Enhanced Foundation Model for Semi-supervised Medical Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semi-supervised learning (SSL) has emerged as an effective paradigm for medical image segmentation, reducing the reliance on extensive expert annotations. Meanwhile, … |
JIAQI GUO et. al. | ArXiv | 2025-11-24 |
| 264 | SegSplat: Feed-forward Gaussian Splatting and Open-Set Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We have introduced SegSplat, a novel framework designed to bridge the gap between rapid, feed-forward 3D reconstruction and rich, open-vocabulary semantic understanding. |
Peter Siegel; Federico Tombari; Marc Pollefeys; Daniel Barath; | arxiv-cs.CV | 2025-11-23 |
| 265 | Improved COOT Optimization: An Approach to Multilevel Thresholding in Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes the application of an improved COOT (ICOOT) optimization algorithm for multilevel image thresholding. |
SIMRANDEEP SINGH et. al. | Scientific Reports | 2025-11-21 |
| 266 | Analysis of Pedestrian Semantic Segmentation Technology in Autonomous Driving Scenarios Under Occlusion Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a comprehensive survey of both traditional and occlusion-aware semantic segmentation approaches, with a structured analysis of their evolution, strengths, and limitations. |
Yingxin He; | Scientific Journal of Technology | 2025-11-21 |
| 267 | Graph Neural Networks for Surgical Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Methods: We propose two segmentation models integrating Vision Transformer (ViT) feature encoders with Graph Neural Networks (GNNs) to explicitly model spatial relationships between anatomical regions. |
Yihan Li; Nikhil Churamani; Maria Robu; Imanol Luengo; Danail Stoyanov; | arxiv-cs.CV | 2025-11-20 |
| 268 | VideoSeg-R1: Reasoning Video Object Segmentation Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditional video reasoning segmentation methods rely on supervised fine-tuning, which limits generalization to out-of-distribution scenarios and lacks explicit reasoning. To address this, we propose VideoSeg-R1, the first framework to introduce reinforcement learning into video reasoning segmentation. |
Zishan Xu; Yifu Guo; Yuquan Lu; Fengyu Yang; Junxin Li; | arxiv-cs.CV | 2025-11-20 |
| 269 | MaskMed: Decoupled Mask and Class Prediction for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a unified decoupled segmentation head that separates multi-class prediction into class-agnostic mask prediction and class label prediction using shared object queries. |
Bin Xie; Gady Agam; | arxiv-cs.CV | 2025-11-19 |
| 270 | Re-purposing SAM Into Efficient Visual Projectors for MLLM-Based Referring Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by text tokenizers, we propose a novel semantic visual projector that leverages semantic superpixels generated by SAM to identify “visual words” in an image. |
Xiaobo Yang; Xiaojin Gong; | ACM Transactions on Multimedia Computing, Communications, … | 2025-11-19 |
| 271 | DiffPixelFormer: Differential Pixel-Aware Transformer for RGB-D Indoor Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although RGB-D fusion leverages complementary appearance and geometric cues, existing methods often depend on computationally intensive cross-attention mechanisms and insufficiently model intra- and inter-modal feature relationships, resulting in imprecise feature alignment and limited discriminative representation. To address these challenges, we propose DiffPixelFormer, a differential pixel-aware Transformer for RGB-D indoor scene segmentation that simultaneously enhances intra-modal representations and models inter-modal interactions. |
YAN GONG et. al. | arxiv-cs.CV | 2025-11-17 |
| 272 | Analysis of Various Image Segmentation Techniques on Retinal OCT Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To reduce the speckle noise during preprocessing, the Wiener filter approach is used. |
G. Vyshnavi; Dr. G. Jhansi Reddy; T. Suchitra; M. Sharanya; | INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING … | 2025-11-17 |
| 273 | DBGroup: Dual-Branch Point Grouping for Weakly Supervised 3D Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these approaches still encounter limitations, including labor-intensive annotation processes, high complexity, and reliance on expert annotators. To address these challenges, we propose DBGroup, a two-stage weakly supervised 3D instance segmentation framework that leverages scene-level annotations as a more efficient and scalable alternative. |
Xuexun Liu; Xiaoxu Xu; Qiudan Zhang; Lin Ma; Xu Wang; | arxiv-cs.CV | 2025-11-13 |
| 274 | TrueCity: Real and Simulated Urban Data for Cross-Domain 3D Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce TrueCity, the first urban semantic segmentation benchmark with cm-accurate annotated real-world point clouds, semantic 3D city models, and annotated simulated point clouds representing the same city. |
DUC NGUYEN et. al. | arxiv-cs.CV | 2025-11-10 |
| 275 | Enhancing Semantic Segmentation with A Boundary-sensitive Loss Function: A Novel Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a novel boundary-sensitive loss function, which combines region loss and boundary loss, to enhance both region consistency and edge delineation in segmentation tasks. |
Ganesh R. Padalkar; Madhuri B. Khambete; | International Journal of Electrical and Computer … | 2025-11-08 |
| 276 | GMM-based VAE Model with Normalising Flow for Effective Stochastic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work proposes a novel framework by integrating Gaussian Mixture Model (GMM) with Normalizing Flow (NF) in CVAE for stochastic segmentation. |
Conghui Li; Chern Hong Lim; Xin Wang; | nips | 2025-11-07 |
| 277 | Seg4Diff: Unveiling Open-Vocabulary Semantic Segmentation in Text-to-Image Diffusion Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Seg4Diff (Segmentation for Diffusion), a systematic framework for analyzing the attention structures of MM-DiT, with a focus on how specific layers propagate semantic information from text to image. |
CHAEHYUN KIM et. al. | nips | 2025-11-07 |
| 278 | ARGenSeg: Image Segmentation with Autoregressive Image Generation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These methods rely on discrete representations or semantic prompts fed into task-specific decoders, which limits the ability of the MLLM to capture fine-grained visual details. To address these challenges, we introduce a segmentation framework for MLLM based on image generation, which naturally produces dense masks for target objects. |
XIAOLONG WANG et. al. | nips | 2025-11-07 |
| 279 | $\epsilon$-Seg: Sparsely Supervised Semantic Segmentation of Microscopy Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce $\epsilon$-Seg, a method based on hierarchical variational autoencoders (HVAEs), employing center-region masking, sparse label contrastive learning (CL), a Gaussian mixture model (GMM) prior, and clustering-free label prediction. |
Sheida RahnamaiKordasiabi; Damian Dalle Nogare; Florian Jug; | nips | 2025-11-07 |
| 280 | UniMRSeg: Unified Modality-Relax Segmentation Via Hierarchical Self-Supervised Compensation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a unified modality-relax segmentation network (UniMRSeg) through hierarchical self-supervised compensation (HSSC). |
XIAOQI ZHAO et. al. | nips | 2025-11-07 |
| 281 | UFO: A Unified Approach to Fine-grained Visual Perception Via Open-ended Language Interface IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This is primarily because these tasks often rely heavily on task-specific designs and architectures that can complicate the modeling process. To address this challenge, we present UFO, a framework that unifies fine-grained visual perception tasks through an open-ended language interface. |
HAO TANG et. al. | nips | 2025-11-07 |
| 282 | FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven By Referential Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To enable accurate and controllable image editing, we propose a progressive multi-stage training pipeline, where segmentation masks are jointly optimized and used as spatial condition prompts to guide the diffusion decoder. |
FAN YANG et. al. | nips | 2025-11-07 |
| 283 | LangHOPS: Language Grounded Hierarchical Open-Vocabulary Part Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose LangHOPS, the first Multimodal Large Language Model (MLLM)-based framework for open-vocabulary object–part instance segmentation. |
YANG MIAO et. al. | nips | 2025-11-07 |
| 284 | No Object Is An Island: Enhancing 3D Semantic Segmentation Generalization with Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel cross-modal learning framework based on diffusion models to enhance the generalization of 3D semantic segmentation, named XDiff3D. |
Fan Li; Xuan Wang; Xuanbin Wang; Zhaoxiang Zhang; Yuelei Xu; | nips | 2025-11-07 |
| 285 | Pancakes: Consistent Multi-Protocol Image Segmentation Across Biomedical Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce _Pancakes_, a framework that, given a new image from a previously unseen domain, automatically generates multi-label segmentation maps for _multiple_ plausible protocols, while maintaining semantic consistency across related images. |
Marianne Rakic; Siyu Gai; Etienne Chollet; John Guttag; Adrian V Dalca; | nips | 2025-11-07 |
| 286 | Leveraging Depth and Language for Open-Vocabulary Domain-Generalized Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce Vireo, a novel single-stage framework for OV-DGSS that unifies the strengths of OVSS and DGSS for the first time. |
SIYU CHEN et. al. | nips | 2025-11-07 |
| 287 | Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In response, we propose a novel TTA method tailored to adapting VLMs for segmentation during test time. |
MEHRDAD NOORI et. al. | nips | 2025-11-07 |
| 288 | Autonomous Semantic Mapping for SLAM Systems IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We proposed an autonomous semantic mapping approach that integrates multimodal semantic segmentation and SLAM techniques to construct a dense 3D semantic map in real time. |
YONG HE et. al. | ISPRS Annals of the Photogrammetry, Remote Sensing and … | 2025-11-03 |
| 289 | MicroAUNet: Boundary-Enhanced Multi-scale Fusion with Knowledge Distillation for Colonoscopy Polyp Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, current deep learning-based polyp segmentation models either compromise clinical decision-making by providing ambiguous polyp margins in segmentation outputs or rely on heavy architectures with high computational complexity, resulting in insufficient inference speeds for real-time colorectal endoscopic applications. To address this problem, we propose MicroAUNet, a light-weighted attention-based segmentation network that combines depthwise-separable dilated convolutions with a single-path, parameter-shared channel-spatial attention block to strengthen multi-scale boundary features. |
Ziyi Wang; Yuanmei Zhang; Dorna Esrafilzadeh; Ali R. Jalili; Suncheng Xiang; | arxiv-cs.CV | 2025-11-02 |
| 290 | Co2SAM: Exploring Co-Occurrence Challenges With SAM in Weakly Supervised Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Weakly supervised semantic segmentation builds a semantic segmentation model with only image-level annotations, which provide only categorical information without localization … |
CHUNMENG LIU et. al. | IEEE Internet of Things Journal | 2025-11-01 |
| 291 | RM2Occ: Re-Projection Multi-Task Multi-Sensor Fusion for Autonomous Driving 3D Object Detection and Occupancy Perception IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Occupancy prediction plays a crucial role in supporting autonomous driving planning and decision-making. Existing methods typically rely on modular stacking and fusion techniques … |
YILONG REN et. al. | IEEE Transactions on Intelligent Transportation Systems | 2025-11-01 |
| 292 | GenBEV: Generative Model With Semantic Compensation for Bird’s Eye View Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Bird’s-Eye View (BEV) semantic segmentation is a key technology for constructing high-precision maps in low-cost visual navigation systems. The main challenge lies in effectively … |
JIAHUI YU et. al. | IEEE Transactions on Intelligent Transportation Systems | 2025-11-01 |
| 293 | Bridging The Semantic Gap in Medical Image Segmentation Via Multi-scale Dependency and Attention-guided Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
MINGRONG LI et. al. | Scientific Reports | 2025-10-30 |
| 294 | Key Technologies for Real-time Localization and Scene Semantic Segmentation of Mobile Robots in Dynamic Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, the team investigates vision- and lidar-based SLAM methods, lightweight deep neural networks for semantic segmentation, and a multi-sensor fusion framework in various dynamic scenarios. |
Long-Xue Cheng; Jun-Xia Han; | Journal of Computers | 2025-10-28 |
| 295 | EIR-SDG: Explore Invariant Representation for Single-source Domain Generalization in Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose EIR-SDG, a novel SDG approach that explores domain-invariant representation for medical image segmentation. |
ZIWEI NIU et. al. | mm | 2025-10-27 |
| 296 | Stepwise Decomposition and Dual-stream Focus: A Novel Approach for Training-free Camouflaged Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To mitigate the issues above, we propose RDVP-MSD, a novel training-free test-time adaptation framework that synergizes Region-constrained Dual-stream Visual Prompting (RDVP) via Multimodal Stepwise Decomposition Chain of Thought (MSD-CoT). |
CHAO YIN et. al. | mm | 2025-10-27 |
| 297 | What You Perceive Is What You Conceive: A Cognition-Inspired Framework for Open Vocabulary Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing paradigms typically perform class-agnostic region segmentation followed by category matching, which deviates from the human visual system’s process of recognizing objects based on semantic concepts, leading to poor alignment between region segmentation and object concepts. To bridge this gap, we propose a novel Cognition-Inspired Framework for open vocabulary image segmentation that emulates the human visual recognition process: first forming a conceptual understanding of an object, then perceiving its spatial extent. |
JIANGHANG LIN et. al. | mm | 2025-10-27 |
| 298 | Graph-Guided Dual-Level Augmentation for 3D Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most augmentation strategies only focus on local transformations or semantic recomposition, lacking the consideration of global structural dependencies within scenes. To address this limitation, we propose a graph-guided data augmentation framework with dual-level constraints for realistic 3D scene synthesis. |
HONGBIN LIN et. al. | mm | 2025-10-27 |
| 299 | A Wheat Spike Image Segmentation Method Based on Improved U-Net Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address issues such as small color differences in the background of wheat spike images in the field and low segmentation accuracy, this article proposes a wheat spike segmentation method called SAU-Net (Striped Pooling and Attention Mechanism optimized U-Net). |
XIAOLEI WANG et. al. | PeerJ Computer Science | 2025-10-27 |
| 300 | VaF-LangSplat: Voxel-Aware Fusion Language Gaussian Splatting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This approach not only compromises time and space efficiency but also degrades accuracy when selecting optimal semantic levels. To overcome these limitations, we propose Voxel-Aware Fusion Language Gaussian Splatting (VaF-LangSplat), a novel framework that jointly optimizes geometric and semantic representations. |
Changzhou Li; Xinyu Yang; Weiguo Yang; Xinyi Li; | mm | 2025-10-27 |
| 301 | Epipolar Consistency-based Network for Structure-Aware LF Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: On the other hand, the depth continuity for the same object ensures semantic consistency in adjacent regions. Therefore, effectively extracting structural cues and integrating them into semantic segmentation are key points in LF semantic segmentation. In this paper, we propose an Epipolar Consistency-based network for structure-aware LF semantic segmentation, termed ECNet. |
Chen Gao; Youfang Lin; Wenbin Wang; Shuo Zhang; | mm | 2025-10-27 |
| 302 | Segmenting Objectiveness and Task-awareness Unknown Region for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel framework termed Segmenting Objectiveness and Task-Awareness (SOTA) for autonomous driving scenes. |
MI ZHENG et. al. | mm | 2025-10-27 |
| 303 | PGOV3D: Open-Vocabulary 3D Semantic Segmentation with Partial-to-Global Curriculum Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose PGOV3D, a novel framework that introduces Partial-to-Global curriculum to improve Open-Vocabulary 3D semantic segmentation. |
SHIQI ZHANG et. al. | mm | 2025-10-27 |
| 304 | Video-based Transparent Object Segmentation Via Temporal Feature Aggregation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, detecting transparent areas from video has not been well explored, especially for different kinds of transparent categories besides glass, due to the scarcity of such a dataset. Therefore, in this paper, we propose the video-based transparent object segmentation task and introduce the first-of-its-kind corresponding dataset named TransVid, which contains nearly 400 videos with a total of 18,523 frames. |
ZHEN WANG et. al. | mm | 2025-10-27 |
| 305 | ACS-SegNet: An Attention-Based CNN-SegFormer Segmentation Network for Tissue Segmentation in Histopathology Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we propose a novel approach based on attention-driven feature fusion of convolutional neural networks (CNNs) and vision transformers (ViTs) within a unified dual-encoder model to improve semantic segmentation performance. Evaluation on two publicly available datasets showed that our model achieved μIoU/μDice scores of 76.79%/86.87% on the GCPS dataset and 64.93%/76.60% on the PUMA dataset, outperforming state-of-the-art and baseline benchmarks. |
Nima Torbati; Anastasia Meshcheryakova; Ramona Woitek; Diana Mechtcheriakova; Amirreza Mahbod; | arxiv-cs.CV | 2025-10-23 |
| 306 | IMPROVING UNDERWATER SEMANTIC SEGMENTATION VIA ADAPTIVE FREQUENCY-AWARE ENHANCEMENT AND THE SEGMENT ANYTHING MODEL Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study proposes a hybrid deep learning framework that combines image enhancement with robust segmentation. |
Abhisheka Thumbesara Eshwara; | International Journal of Applied Mathematics | 2025-10-22 |
| 307 | Semantic Scene Completion from A Single Depth Image with Coarse-Grained Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic Scene Completion (SSC) plays an important role in computer vision applications such as mobile robots navigation. SSC aims to reconstruct a complete 3D volumetric scene … |
Jiun Yen Ching; Lai-Kuan Wong; Fabian Kung; | 2025 Asia Pacific Signal and Information Processing … | 2025-10-22 |
| 308 | Revisiting Efficient Semantic Segmentation: Learning Offsets for Better Spatial and Class Feature Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With experimental analysis, we find that this paradigm results in a highly challenging assumption for efficient scenarios: Image pixel features should not vary for the same category in different images. To address this dilemma, we propose a coupled dual-branch offset learning paradigm that explicitly learns feature and class offsets to dynamically refine both class representations and spatial image features. |
Shi-Chen Zhang; Yunheng Li; Yu-Huan Wu; Qibin Hou; Ming-Ming Cheng; | iccv | 2025-10-20 |
| 309 | Auto-Vocabulary Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce Auto-Vocabulary Semantic Segmentation (AVS), advancing open-ended image understanding by eliminating the necessity to predefine object categories for segmentation. |
Osman Ülger; Maksymilian Kulicki; Yuki Asano; Martin R. Oswald; | iccv | 2025-10-20 |
| 310 | Communication-Efficient Multi-Vehicle Collaborative Semantic Segmentation Via Sparse 3D Gaussian Sharing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods share the entire dense, scene-level BEV feature, which contains significant redundancy and lacks height information, ultimately leading to unavoidable bandwidth waste and performance degradation. To address these challenges, we present GSCOOP, the first collaborative semantic segmentation framework that leverages sparse, object-centric 3D Gaussians to fundamentally overcome communication bottlenecks. |
TIANYU HONG et. al. | iccv | 2025-10-20 |
| 311 | SPA: Efficient User-Preference Alignment Against Uncertainty in Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While prior uncertainty-aware and interactive methods offer adaptability, they are inefficient at test time: uncertainty-aware models require users to choose from numerous similar outputs, while interactive models demand significant user input through click or box prompts to refine segmentation. To address these challenges, we propose SPA, a new Segmentation Preference Alignment framework that efficiently adapts to diverse test-time preferences with minimal human interaction. |
Jiayuan Zhu; Junde Wu; Cheng Ouyang; Konstantinos Kamnitsas; J. Alison Noble; | iccv | 2025-10-20 |
| 312 | Feature Purification Matters: Suppressing Outlier Propagation for Training-Free Open-Vocabulary Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a Self-adaptive Feature Purifier framework (SFP) to suppress propagated outliers and enhance semantic representations for open-vocabulary semantic segmentation. |
SHUO JIN et. al. | iccv | 2025-10-20 |
| 313 | CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Accordingly, we propose CorrCLIP, which reconstructs the scope and value of patch correlations. |
Dengke Zhang; Fagui Liu; Quan Tang; | iccv | 2025-10-20 |
| 314 | Seeing The Unseen: A Semantic Alignment and Context-Aware Prompt Framework for Open-Vocabulary Camouflaged Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although existing open-vocabulary methods exhibit strong segmentation capabilities, they still have a major limitation in camouflaged scenarios: semantic confusion, which leads to incomplete segmentation and class shift in the model. To mitigate the above limitation, we propose a framework for OVCOS, named SuCLIP. |
Peng Ren; Tian Bai; Jing Sun; Fuming Sun; | iccv | 2025-10-20 |
| 315 | Unsupervised Histopathological Image Semantic Segmentation with Overlapping Patches Consistency Constraint Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a context-based Overlapping Patches Consistency Constraint (OPCC), which employs the consistency constraint between the local overlapping region’s similarity and global context similarity, achieving consistent class representation in similar environments. |
WENTIAN CAI et. al. | iccv | 2025-10-20 |
| 316 | DiSCO-3D : Discovering and Segmenting Sub-Concepts from Open-vocabulary Queries in NeRF Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose DiSCO-3D, the first method addressing the broader problem of 3D Open-Vocabulary Sub-concepts Discovery, which aims to provide a 3D semantic segmentation that adapts to both the scene and user queries. |
Doriand Petit; Steve Bourgeois; Vincent Gay-Bellile; Florian Chabot; Loïc Barthe; | iccv | 2025-10-20 |
| 317 | LIRA: Inferring Segmentation in Large Multi-modal Models with Local Interleaved Region Assistance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These challenges stem primarily from constraints in weak visual comprehension and a lack of fine-grained perception. To alleviate these limitations, we propose LIRA, a framework that capitalizes on the complementary relationship between visual comprehension and segmentation via two key components: (1) Semantic-Enhanced Feature Extractor (SEFE) improves object attribute inference by fusing semantic and pixel-level features, leading to more accurate segmentation; (2) Interleaved Local Visual Coupling (ILVC) autoregressively generates local descriptions after extracting local features based on segmentation masks, offering fine-grained supervision to mitigate hallucinations. |
ZHANG LI et. al. | iccv | 2025-10-20 |
| 318 | HyPiDecoder: Hybrid Pixel Decoder for Efficient Segmentation and Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the inefficiency of MSDeformAttn has become a performance bottleneck for segmenters. To address this, we propose the Hyper Pixel Decoder (HyPiDecoder), an improved Pixel Decoder design that replaces parts of the MSDeformAttn layers with convolution-based FPN layers, introducing explicit locality information and significantly boosting inference speed. |
Fengzhe Zhou; Humphrey Shi; | iccv | 2025-10-20 |
| 319 | How Do Optical Flow and Textual Prompts Collaborate to Assist in Audio-Visual Semantic Segmentation? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a novel collaborative framework, Stepping Stone Plus (SSP), which integrates optical flow and textual prompts to assist the segmentation process. |
Yujian Lee; Peng Gao; Yongqi Xu; Wentao Fan; | iccv | 2025-10-20 |
| 320 | Alleviating Textual Reliance in Medical Language-guided Segmentation Via Prototype-driven Semantic Approximation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, its inherent reliance on paired image-text input, which we refer to as "textual reliance", presents two fundamental limitations: 1) many medical segmentation datasets lack paired reports, leaving a substantial portion of image-only data underutilized for training; and 2) inference is limited to retrospective analysis of cases with paired reports, limiting its applicability in most clinical scenarios where segmentation typically precedes reporting. To address these limitations, we propose ProLearn, the first Prototype-driven Learning framework for language-guided segmentation that fundamentally alleviates textual reliance. |
Shuchang Ye; Usman Naseem; Mingyuan Meng; Jinman Kim; | iccv | 2025-10-20 |
| 321 | HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The remarkable performance of large multimodal models (LMMs) has attracted significant interest from the image segmentation community. To align with the next-token-prediction paradigm, current LMM-driven segmentation methods either use object boundary points to represent masks or introduce special segmentation tokens, whose hidden states are decoded by a segmentation model requiring the original image as input. However, these approaches often suffer from inadequate mask representation and complex architectures, limiting the potential of LMMs. In this work, we propose the Hierarchical Mask Tokenizer (HiMTok), which represents segmentation masks with up to 32 tokens and eliminates the need for the original image during mask de-tokenization. |
Tao Wang; Changxu Cheng; Lingfeng Wang; Senda Chen; Wuyue Zhao; | iccv | 2025-10-20 |
| 322 | CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present a novel hierarchical framework, named CLIPer, that hierarchically improves spatial representation of CLIP. |
Lin Sun; Jiale Cao; Jin Xie; Xiaoheng Jiang; Yanwei Pang; | iccv | 2025-10-20 |
| 323 | Images As Noisy Labels: Unleashing The Potential of The Diffusion Model for Open-Vocabulary Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a DEnoising learning framework based on the Diffusion model for Open-vocabulary semantic Segmentation, called DEDOS, which is aimed at constructing the scene skeleton. |
Fan Li; Xuanbin Wang; Xuan Wang; Zhaoxiang Zhang; Yuelei Xu; | iccv | 2025-10-20 |
| 324 | Kestrel: 3D Multimodal LLM for Part-Aware Grounded Description Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Part-Aware Point Grounded Description (PaPGD), a challenging task aimed at advancing 3D multimodal learning for fine-grained, part-aware segmentation grounding and detailed explanation of 3D objects. |
Mahmoud Ahmed; Junjie Fei; Jian Ding; Eslam Mohamed Bakr; Mohamed Elhoseiny; | iccv | 2025-10-20 |
| 325 | Efficient Track Anything IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: The high computation complexity of image encoder and memory module has limited its applications in real-world tasks, e.g., video object segmentation on mobile devices. To address this limitation, we propose EfficientTAMs, lightweight end-to-end track anything models that produce high-quality results with low latency and small model size. |
YUNYANG XIONG et. al. | iccv | 2025-10-20 |
| 326 | Identity-aware Language Gaussian Splatting for Open-vocabulary 3D Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This inconsistency highly results in mis-labeling where different language embeddings are assigned to the same part of an object. To address this issue, we propose a simple yet powerful method that aligns language embeddings via the identity information. |
SungMin Jang; Wonjun Kim; | iccv | 2025-10-20 |
| 327 | Exploring Weather-aware Aggregation and Adaptation for Semantic Segmentation Under Adverse Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel weather-aware aggregation and adaptation network that leverages characteristic knowledge to achieve weather homogenization and enhance scene perception. |
Yuwen Pan; Rui Sun; Wangkai Li; Tianzhu Zhang; | iccv | 2025-10-20 |
| 328 | MaskSAM: Auto-prompt SAM with Mask Classification for Volumetric Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, SAM is not directly applicable to medical image segmentation due to its inability to predict semantic labels, reliance on additional prompts, and suboptimal performance in this domain. To address these limitations, we propose MaskSAM, a novel prompt-free SAM adaptation framework for medical image segmentation based on mask classification. |
BIN XIE et. al. | iccv | 2025-10-20 |
| 329 | Intelligent Communication Mixture-of-Experts Boosted-Medical Image Segmentation Foundation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: 2) We propose a semantic-guided contrastive learning method to address the issue of weak supervision in contrastive learning. |
Xinwei Zhang; Hu Chen; Zhe Yuan; Sukun Tian; Peng Feng; | arxiv-cs.CV | 2025-10-20 |
| 330 | Correspondence As Video: Test-Time Adaption on SAM2 for Reference Segmentation in The Wild Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we propose a novel approach by representing the inherent correspondence between reference-target image pairs as a pseudo video. |
Haoran Wang; Zekun Li; Jian Zhang; Lei Qi; Yinghuan Shi; | iccv | 2025-10-20 |
| 331 | EVOLVE: Event-Guided Deformable Feature Transfer and Dual-Memory Refinement for Low-Light Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Video Object Segmentation (VOS) in low-light scenarios remains highly challenging due to significant texture loss and severe noise, which often lead to unreliable image feature generation and degraded segmentation performance. To address this issue, we propose EVOLVE, a novel event-guided deformable feature transfer and dual-memory refinement framework for low-light VOS. |
Jong-Hyeon Baek; Jiwon Oh; Yeong Jun Koh; | iccv | 2025-10-20 |
| 332 | Zero-Shot Semantic Segmentation for Robots in Agriculture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Conventional crop production, which is essential for providing food, feed, fuel, and fiber for our society, relies heavily on harmful herbicides to control weeds. Instead, … |
YUE LINN et. al. | 2025 IEEE/RSJ International Conference on Intelligent … | 2025-10-19 |
| 333 | Robust Maritime Object Detection Under Adverse Conditions Via Joint Semantic Learning Without Extra Computational Overhead Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study addresses the challenge of robust object detection in maritime environments, where dynamic conditions such as fog, brightness variations, and motion blur can degrade … |
Junseok Lee; Seongju Lee; Jongwon Kim; Jumi Park; Kyoobin Lee; | 2025 IEEE/RSJ International Conference on Intelligent … | 2025-10-19 |
| 334 | Neuro-Symbolic Spatial Reasoning in Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In contrast to contemporary VLM correlation-based approaches, we propose Relational Segmentor (RelateSeg) to impose explicit spatial relational constraints by first order logic (FOL) formulated in a neural network architecture. |
Jiayi Lin; Jiabo Huang; Shaogang Gong; | arxiv-cs.CV | 2025-10-17 |
| 335 | Semantic Segmentation with Coarse Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a regularization method for models with an encoder-decoder architecture with superpixel based upsampling. |
Jort de Jong; Mike Holenderski; | arxiv-cs.CV | 2025-10-17 |
| 336 | A Symmetry-Aware BAS for Improved Fuzzy Intra-Class Distance-Based Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: At present, the Beetle Antennae Search (BAS) algorithm has achieved remarkable success in image segmentation. |
Yazhi Wang; Lei Ding; Qing Zhang; | Symmetry | 2025-10-17 |
| 337 | Multi-Task Traffic Scene Perception Algorithm Based on Multi-Scale Prompter Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The algorithm introduces a multi-scale prompt learning method to obtain rich multi-scale feature maps and prompt words. |
Kaibo Yang; Mingen Zhong; Kang Fan; Jiawei Tan; | Engineering Research Express | 2025-10-15 |
| 338 | The Application and Challenges of Deep Learning in Semantic Segmentation of High-resolution Remote Sensing Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper investigates the application of deep learning technologies in remote sensing image semantic segmentation, based on Convolutional Neural Networks (CNN) and Transformer-based semantic segmentation methods. |
Shijing Hu; | Advances in Engineering Innovation | 2025-10-14 |
| 339 | A Framework for Low-Effort Training Data Generation for Urban Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this, we present a new framework that adapts an off-the-shelf diffusion model to a target domain using only imperfect pseudo-labels. |
DENIS ZAVADSKI et. al. | arxiv-cs.CV | 2025-10-13 |
| 340 | DTEA: Dynamic Topology Weaving and Instability-Driven Entropic Attenuation for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Extensive experiments on three benchmark datasets show our framework achieves superior segmentation accuracy and better generalization across various clinical settings. |
WEIXUAN LI et. al. | arxiv-cs.CV | 2025-10-13 |
| 341 | MMFNet: A Mamba-Based Multimodal Fusion Network for Remote Sensing Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes MMFNet, a novel multimodal fusion network that leverages the Mamba architecture to efficiently capture long-range dependencies for semantic segmentation tasks. |
Jingting Qiu; Wei Chang; Wei Ren; Shanshan Hou; Ronghao Yang; | Sensors | 2025-10-08 |
| 342 | Temporal Prompting Matters: Rethinking Referring Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Most existing methods require end-to-end training with dense mask annotations, which could be computation-consuming and less scalable. In this work, we rethink the RVOS problem and aim to investigate the key to this task. |
CI-SIANG LIN et. al. | arxiv-cs.CV | 2025-10-08 |
| 343 | SquareNet: Multi-scale Progressive Difference and Scale-cross Attention Network for Volumetric Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we propose a dual encoder-decoder network architecture comprising a multi-scale progressive difference (MSPD) branch and a group scale-cross attention (GSCA) branch. |
HUAXIANG LIU et. al. | Engineering Research Express | 2025-10-08 |
| 344 | Semantic Segmentation Algorithm Based on Light Field and LiDAR Fusion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation serves as a cornerstone of scene understanding in autonomous driving but continues to face significant challenges under complex conditions such as occlusion. … |
Jie Luo; Yuxuan Jiang; Xin Jin; Mingyu Liu; Yihui Fan; | arxiv-cs.CV | 2025-10-08 |
| 345 | Uncertainty Driven Adaptive Self-Knowledge Distillation for Medical Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Deep learning has recently significantly improved the precision of medical image segmentation. However, due to the commonly limited dataset scale and reliance on hard labels … |
XUTAO GUO et. al. | IEEE Transactions on Emerging Topics in Computational … | 2025-10-01 |
| 346 | Transferable Image Synthesis for Remote Sensing Semantic Segmentation Via Joint Reference-semantic Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
RUNMIN DONG et. al. | Inf. Fusion | 2025-10-01 |
| 347 | Semantic Knowledge Transfer for Semi-supervised Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Shiwei Zhou; Haifeng Zhao; Leilei Ma; Dengdi Sun; | Eng. Appl. Artif. Intell. | 2025-10-01 |
| 348 | DBiSeNet: Dual Bilateral Segmentation Network for Real-time Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Xiaobo Hu; Hongbo Zhu; Ning Su; Taosheng Xu; | Comput. Vis. Image Underst. | 2025-10-01 |
| 349 | A Medical Image Semantic Segmentation Method Based on Multi-color Space Information Fusion Strategy Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
YUEFEI WANG et. al. | Inf. Fusion | 2025-10-01 |
| 350 | CORE-3D: Context-aware Open-vocabulary Retrieval By Embeddings in 3D Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these methods often produce fragmented masks and inaccurate semantic assignments due to the direct use of raw masks, limiting their effectiveness in complex environments. To address this, we leverage SemanticSAM with progressive granularity refinement to generate more accurate and numerous object-level masks, mitigating the over-segmentation commonly observed in mask generation models such as vanilla SAM, and improving downstream 3D semantic segmentation. |
Mohamad Amin Mirzaei; Pantea Amoie; Ali Ekhterachian; Matin Mirzababaei; Babak Khalaj; | arxiv-cs.CV | 2025-09-29 |
| 351 | 2nd Place Report of MOSEv2 Challenge 2025: Concept Guided Video Object Segmentation Via SeC Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we evaluate its zero-shot performance on the challenging coMplex video Object SEgmentation v2 (MOSEv2) dataset. |
ZHIXIONG ZHANG et. al. | arxiv-cs.CV | 2025-09-28 |
| 352 | SSVIF: Self-Supervised Segmentation-Oriented Visible and Infrared Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, compared to traditional methods, application-oriented VIF methods require datasets labeled for downstream tasks (e.g., semantic segmentation or object detection), making data acquisition labor-intensive and time-consuming. To address this issue, we propose a self-supervised training framework for segmentation-oriented VIF methods (SSVIF). |
Zixian Zhao; Xingchen Zhang; | arxiv-cs.CV | 2025-09-26 |
| 353 | SwinMamba: A Hybrid Local-global Mamba Framework for Enhancing Semantic Segmentation of Remotely Sensed Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation of remote sensing imagery is a fundamental task in computer vision, supporting a wide range of applications such as land use classification, urban planning, … |
Qinfeng Zhu; Han Li; Liang He; Lei Fan; | arxiv-cs.CV | 2025-09-25 |
| 354 | Boosting LiDAR-Based Localization with Semantic Insight: Camera Projection Versus Direct LiDAR Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Semantic segmentation of LiDAR data presents considerable challenges, particularly when dealing with diverse sensor types and configurations. However, incorporating semantic information can significantly enhance the accuracy and robustness of LiDAR-based localization techniques for autonomous mobile systems. We propose an approach that integrates semantic camera data with LiDAR segmentation to address this challenge. |
Sven Ochs; Philip Schörner; Marc René Zofka; J. Marius Zöllner; | arxiv-cs.RO | 2025-09-24 |
| 355 | Seg4Diff: Unveiling Open-Vocabulary Segmentation in Text-to-Image Diffusion Transformers IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Seg4Diff (Segmentation for Diffusion), a systematic framework for analyzing the attention structures of MM-DiT, with a focus on how specific layers propagate semantic information from text to image. |
CHAEHYUN KIM et. al. | arxiv-cs.CV | 2025-09-22 |
| 356 | MSGFusion: Multimodal Scene Graph-Guided Infrared and Visible Image Fusion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Infrared and visible image fusion has garnered considerable attention owing to the strong complementarity of these two modalities in complex, harsh environments. While deep … |
Guihui Li; Bowei Dong; Kaizhi Dong; Jiayi Li; Haiyong Zheng; | arxiv-cs.CV | 2025-09-16 |
| 357 | Microsurgical Instrument Segmentation for Robot-Assisted Surgery Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Microsurgery Instrument Segmentation for Robotic Assistance (MISRA), a segmentation framework that augments RGB input with luminance channels, integrates skip attention to preserve elongated features, and employs an Iterative Feedback Module (IFM) for continuity restoration across multiple passes. |
Tae Kyeong Jeong; Garam Kim; Juyoun Park; | arxiv-cs.CV | 2025-09-15 |
| 358 | Instance-Guided Class Activation Mapping for Weakly Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our approach demonstrates superior localization accuracy, with complete object coverage and precise boundary delineation, while maintaining computational efficiency. |
Ali Torabi; Sanjog Gaihre; MD Mahbubur Rahman; Yaqoob Majeed; | arxiv-cs.CV | 2025-09-15 |
| 359 | MAFS: Masked Autoencoder for Infrared-Visible Image Fusion and Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Infrared-visible image fusion methods aim at generating fused images with good visual quality and also facilitate the performance of high-level tasks. Indeed, existing semantic-driven methods have considered semantic information injection for downstream applications. |
Liying Wang; Xiaoli Zhang; Chuanmin Jia; Siwei Ma; | arxiv-cs.CV | 2025-09-15 |
| 360 | OpenUrban3D: Annotation-Free Open-Vocabulary Semantic Segmentation of Large-Scale Urban Point Clouds Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The main obstacles are the frequent absence of high-quality, well-aligned multi-view imagery in large-scale urban point cloud datasets and the poor generalization of existing three-dimensional (3D) segmentation pipelines across diverse urban environments with substantial variation in geometry, scale, and appearance. To address these challenges, we present OpenUrban3D, the first 3D open-vocabulary semantic segmentation framework for large-scale urban scenes that operates without aligned multi-view images, pre-trained point cloud segmentation networks, or manual annotations. |
Chongyu Wang; Kunlei Jing; Jihua Zhu; Di Wang; | arxiv-cs.CV | 2025-09-13 |
| 361 | SCOPE: Speech-guided COllaborative PErception Framework for Surgical Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a speech-guided collaborative perception (SCOPE) framework that integrates reasoning capabilities of large language model (LLM) with perception capabilities of open-set VFMs to support on-the-fly segmentation, labeling and tracking of surgical instruments and anatomy in intraoperative video streams. |
Jecia Z. Y. Mao; Francis X Creighton; Russell H Taylor; Manish Sahu; | arxiv-cs.CV | 2025-09-12 |
| 362 | SEEC: Segmentation-Assisted Multi-Entropy Models for Learned Lossless Image Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most existing approaches employ a single entropy model to estimate the probability distribution of pixel values across the entire image, which limits their ability to capture the diverse statistical characteristics of different semantic regions. To overcome this limitation, we propose Segmentation-Assisted Multi-Entropy Models for Lossless Image Compression (SEEC). |
Chunhang Zheng; Zichang Ren; Dou Li; | arxiv-cs.CV | 2025-09-09 |
| 363 | Point Linguist Model: Segment Any Object Via Bridged Large 3D-Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: At the output stage, predictions depend only on dense features without explicit geometric cues, leading to a loss of fine-grained accuracy. To address these limitations, we present the Point Linguist Model (PLM), a general framework that bridges the representation gap between LLMs and dense 3D point clouds without requiring large-scale pre-alignment between 3D-text or 3D-images. |
Zhuoxu Huang; Mingqi Gao; Jungong Han; | arxiv-cs.CV | 2025-09-09 |
| 364 | Text4Seg++: Advancing Image Segmentation Via Generative Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel text-as-mask paradigm that casts image segmentation as a text generation problem, eliminating the need for additional decoders and significantly simplifying the segmentation process. |
MENGCHENG LAN et. al. | arxiv-cs.CV | 2025-09-08 |
| 365 | Unleashing Hierarchical Reasoning: An LLM-Driven Framework for Training-Free Referring Video Object Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Referring Video Object Segmentation (RVOS) aims to segment an object of interest throughout a video based on a language description. The prominent challenge lies in aligning static … |
BINGRUI ZHAO et. al. | arxiv-cs.CV | 2025-09-06 |
| 366 | DiFusionSeg: Diffusion-driven Semantic Segmentation with Multi-modal Image Fusion for Enhanced Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
ZHIWEI WANG et. al. | Knowl. Based Syst. | 2025-09-01 |
| 367 | Multiview Space Function Classification in Apartment Buildings Using Image Deep-Learning Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Amir Ziaee; Georg Suter; | J. Comput. Civ. Eng. | 2025-09-01 |
| 368 | Guided Model-based LiDAR Super-Resolution for Resource-Efficient Automotive Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To overcome this, we introduce the first end-to-end framework that jointly addresses LiDAR super-resolution (SR) and semantic segmentation. |
Alexandros Gkillas; Nikos Piperigkos; Aris S. Lalos; | arxiv-cs.CV | 2025-09-01 |
| 369 | Salient Object Detection Enhanced Pseudo-labels for Weakly Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
YUNPING ZHENG et. al. | J. Vis. Commun. Image Represent. | 2025-09-01 |
| 370 | Domain Consistency Learning for Continual Test-time Adaptation in Image Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yanyu Ye; Wei Wei; Lei Zhang; Chen Ding; Yanning Zhang; | Pattern Recognit. | 2025-09-01 |
| 371 | CMFS: CLIP-Guided Modality Interaction for Mitigating Noise in Multi-Modal Image Fusion and Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Infrared-visible image fusion and semantic segmentation are pivotal tasks for robust scene understanding under challenging conditions such as low light. However, existing methods … |
Guilin Su; Yuqing Huang; Chao Yang; Zhenyu He; | International Joint Conference on Artificial Intelligence | 2025-09-01 |
| 372 | Scene Adaptation Network for Visual-thermal Urban Scene Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Houwang Zhang; Yong-Jie Li; L. Chan; | Eng. Appl. Artif. Intell. | 2025-09-01 |
| 373 | CSFAFormer: Category-selective Feature Aggregation Transformer for Multimodal Remote Sensing Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yue Ni; Donglin Xue; Weijian Chi; Ji Luan; Jiahang Liu; | Inf. Fusion | 2025-09-01 |
| 374 | VoCap: Video Object Captioning and Segmentation from Any Prompt Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Understanding objects in videos in terms of fine-grained localization masks and detailed semantic properties is a fundamental task in video understanding. In this paper, we propose VoCap, a flexible video model that consumes a video and a prompt of various modalities (text, box or mask), and produces a spatio-temporal masklet with a corresponding object-centric caption. |
JASPER UIJLINGS et. al. | arxiv-cs.CV | 2025-08-29 |
| 375 | LabelGS: Label-Aware 3D Gaussian Splatting for 3D Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The identification and isolating of specific object components is crucial. To address this limitation, we propose Label-aware 3D Gaussian Splatting (LabelGS), a method that augments the Gaussian representation with object label. LabelGS introduces cross-view consistent semantic masks for 3D Gaussians and employs a novel Occlusion Analysis Model to avoid overfitting occlusion during optimization, Main Gaussian Labeling model to lift 2D semantic prior to 3D Gaussian and Gaussian Projection Filter to avoid Gaussian label conflict. |
YUPENG ZHANG et. al. | arxiv-cs.CV | 2025-08-27 |
| 376 | GS: Generative Segmentation Via Label Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose GS (Generative Segmentation), a novel framework that formulates segmentation itself as a generative task via label diffusion. |
Yuhao Chen; Shubin Chen; Liang Lin; Guangrun Wang; | arxiv-cs.CV | 2025-08-27 |
| 377 | ArgusCogito: Chain-of-Thought for Cross-Modal Synergy and Omnidirectional Reasoning in Camouflaged Object Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Camouflaged Object Segmentation (COS) poses a significant challenge due to the intrinsic high similarity between targets and backgrounds, demanding models capable of profound … |
JIANWEN TAN et. al. | arxiv-cs.CV | 2025-08-25 |
| 378 | Continual Learning for Weakly-Supervised Histopathology Tissue Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Weakly supervised histopathology segmentation is a widely studied field that aims to achieve pixel-level semantic segmentation using image-level annotations, reducing the need for … |
Wei-Hua Li; Huei-Fang Yang; Chu-Song Chen; | 2025 IEEE Conference on Computational Intelligence in … | 2025-08-20 |
| 379 | Enhancing Nighttime Semantic Segmentation with Visual-Linguistic Priors and Wavelet Transform Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, effectively employing visual-linguistic priors for nighttime semantic segmentation remains underexplored. To address these issues, we propose Text-WaveletFormer, a novel end-to-end framework that integrates text prompts and wavelet-based texture enhancement. |
Jianhou Zhou; Xiaolong Zhou; Sixian Chan; Zhaomin Chen; Xiaoqin Zhang; | ijcai | 2025-08-16 |
| 380 | DenseSAM: Semantic Enhance SAM for Efficient Dense Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, distinguishing numerous similar and densely packed objects in this task presents significant challenges. Several methods, including CNN- and ViT-based approaches, have been proposed to tackle these issues. |
LINYUN ZHOU et. al. | ijcai | 2025-08-16 |
| 381 | Object-Level Backdoor Attacks in RGB-T Semantic Segmentation with Cross-Modality Trigger Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We overcome the critical limitation of current segmentation backdoor attacks that indiscriminately compromise all objects of a victim class, failing to provide fine-grained control for selectively targeting specific objects as required by adversaries. To address this, we introduce a novel Object-level Backdoor Attack pipeline, termed OBA. |
XIANGHAO JIAO et. al. | ijcai | 2025-08-16 |
| 382 | FedSaaS: Class-Consistency Federated Semantic Segmentation Via Global Prototype Supervision and Local Adversarial Harmonization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This oversight results in ambiguities between class representation. To overcome this challenge, we propose a novel federated segmentation framework that strikes class consistency, termed FedSaaS. |
XIAOYANG YU et. al. | ijcai | 2025-08-16 |
| 383 | Unlocking Robust Semantic Segmentation Performance Via Label-only Elastic Deformations Against Implicit Label Noise Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Typical data augmentation methods, which apply identical transformations to the image and its label, risk amplifying these subtle imperfections and limiting the model’s generalization capacity. In this paper, we introduce NSegment+, a novel augmentation framework that decouples image and label transformations to address such realistic noise for semantic segmentation. |
YECHAN KIM et. al. | arxiv-cs.CV | 2025-08-14 |
| 384 | Semantic-aware DropSplat: Adaptive Pruning of Redundant Gaussians for 3D Aerial-View Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This limits their segmentation accuracy and consistency. To tackle these challenges, we propose a novel 3D-AVS-SS approach named SAD-Splat. |
Xu Tang; Junan Jia; Yijing Wang; Jingjing Ma; Xiangrong Zhang; | arxiv-cs.CV | 2025-08-13 |
| 385 | Multi-Sequence Parotid Gland Lesion Segmentation Via Expert Text-Guided Segment Anything Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Besides, current medical image segmentation methods are automatically generated, ignoring the domain knowledge of medical experts when performing segmentation. To address these limitations, we propose the parotid gland segment anything model (PG-SAM), an expert diagnosis text-guided SAM incorporating expert domain knowledge for cross-sequence parotid gland lesion segmentation. |
ZHONGYUAN WU et. al. | arxiv-cs.CV | 2025-08-13 |
| 386 | Instance Segmentation of Scene Sketches Using Natural Image Priors Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce InkLayer, a method for instance segmentation of raster scene sketches. |
Mia Tang; Yael Vinker; Chuan Yan; Lvmin Zhang; Maneesh Agrawala; | siggraph | 2025-08-10 |
| 387 | DreamCraft: Interactive 3D Scene Creation from Editable Panorama in Virtual Reality Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Creating interactive 3D scenes often requires technical expertise and significant time, limiting accessibility for non-experts. To address this, we present DreamCraft, a VR system … |
Cheng-Chih Tsai; Tse-Yu Pan; | Proceedings of the Special Interest Group on Computer … | 2025-08-09 |
| 388 | A Semantic Segmentation Algorithm for Pleural Effusion Based on DBIF-AUNet Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods often struggle with diverse image variations and complex edges, primarily because direct feature concatenation causes semantic gaps. To address these challenges, we propose the Dual-Branch Interactive Fusion Attention model (DBIF-AUNet). |
RUIXIANG TANG et. al. | arxiv-cs.CV | 2025-08-08 |
| 389 | TEFormer: Texture-Aware and Edge-Guided Transformer for Semantic Segmentation of Urban Remote Sensing Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, geospatial objects often exhibit subtle texture differences and similar spatial structures, which can easily lead to semantic ambiguity and misclassification. Moreover, challenges such as irregular object shapes, blurred boundaries, and overlapping spatial distributions of semantic objects contribute to complex and diverse edge morphologies, further complicating accurate segmentation. To tackle these issues, we propose a texture-aware and edge-guided Transformer (TEFormer) that integrates texture awareness and edge-guidance mechanisms for semantic segmentation of URSIs. |
Guoyu Zhou; Jing Zhang; Yi Yan; Hui Zhang; Li Zhuo; | arxiv-cs.CV | 2025-08-08 |
| 390 | Open-world Point Cloud Semantic Segmentation: A Human-in-the-loop Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these limitations, we propose HOW-Seg, the first human-in-the-loop framework for OW-Seg. |
Peng Zhang; Songru Yang; Jinsheng Sun; Weiqing Li; Zhiyong Su; | arxiv-cs.CV | 2025-08-06 |
| 391 | Enhanced Urban Driving Scene Segmentation Using Modified UNet with Residual Convolutions and Attention Guided Skip Connections Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Siddhant Arora; Ahaan Banerjee; Nitish Katal; | Discover Artificial Intelligence | 2025-08-05 |
| 392 | Dynamic Robot-Assisted Surgery with Hierarchical Class-Incremental Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we build upon the recently introduced Taxonomy-Oriented Poincaré-regularized Incremental Class Segmentation (TOPICS) approach and propose an enhanced variant, termed TOPICS+, specifically tailored for robust segmentation of surgical scenes. |
JULIA HINDEL et. al. | arxiv-cs.CV | 2025-08-03 |
| 393 | Transferring Prior Thermal Knowledge for Snowy Urban Scene Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: RGB-thermal (RGB-T) semantic segmentation enables intelligent vehicles to understand environments while operating in urban scenes. However, the research encounters two main … |
XIAODONG GUO et. al. | IEEE Transactions on Intelligent Transportation Systems | 2025-08-01 |
| 394 | Semantic Hierarchy-Guided Adversarial Attack for Autonomous Driving Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Autonomous vehicles employ semantic segmentation as a foundational component for perception and scene understanding, upon which driving decisions can be informed. Despite their … |
Gwangbin Kim; Seungjun Kim; | IEEE Robotics and Automation Letters | 2025-08-01 |
| 395 | Large Model Guided Semantic Segment Anything for Underwater Consumer Electronics Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the rapid advancement of artificial intelligence (AI), large model techniques are increasingly being integrated into underwater consumer electronics, enhancing the … |
Qirui Lin; Hua Li; Yuheng Jia; | IEEE Transactions on Consumer Electronics | 2025-08-01 |
| 396 | A Survey of Deep Learning in Histopathological Nuclear Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Histopathological images contain rich information that can be used to diagnose and monitor disease progression and to predict patient survival. Accurate morphological … |
Lulu Qin; Xiao Yang; Xianhong Xu; Zexuan Zhu; | IEEE Computational Intelligence Magazine | 2025-08-01 |
| 397 | Learning Semantic Directions for Feature Augmentation in Domain-Generalized Medical Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This unique characteristic makes medical image segmentation particularly challenging. To address this challenge, we propose a domain generalization framework tailored for medical image segmentation. |
YINGKAI WANG et. al. | arxiv-cs.CV | 2025-07-31 |
| 398 | Graph-Guided Dual-Level Augmentation for 3D Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most augmentation strategies only focus on local transformations or semantic recomposition, lacking the consideration of global structural dependencies within scenes. To address this limitation, we propose a graph-guided data augmentation framework with dual-level constraints for realistic 3D scene synthesis. |
HONGBIN LIN et. al. | arxiv-cs.CV | 2025-07-30 |
| 399 | Dual Cross-image Semantic Consistency with Self-aware Pseudo Labeling for Semi-supervised Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In general, current approaches, which rely on intra-image pixel-wise consistency training via pseudo-labeling, overlook the consistency at more comprehensive semantic levels (e.g., object region) and suffer from severe discrepancy of extracted features resulting from an imbalanced number of labeled and unlabeled data. To overcome these limitations, we present a new Dual Cross-image Semantic Consistency (DuCiSC) learning framework, for semi-supervised medical image segmentation. |
Han Wu; Chong Wang; Zhiming Cui; | arxiv-cs.CV | 2025-07-28 |
| 400 | Solving Scene Understanding for Autonomous Navigation in Unstructured Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The paper discusses the dataset, exploratory data analysis, preparation, implementation of the five models and studies the performance and compares the results achieved in the process. |
Naveen Mathews Renji; Kruthika K; Manasa Keshavamurthy; Pooja Kumari; S. Rajarajeswari; | arxiv-cs.CV | 2025-07-27 |
| 401 | Object Segmentation in The Wild with Foundation Models: Application to Vision Assisted Neuro-prostheses for Upper Limbs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we address the problem of semantic object segmentation using foundation models. |
Bolutife Atoki; Jenny Benois-Pineau; Renaud Péteri; Fabien Baldacci; Aymar de Rugy; | arxiv-cs.CV | 2025-07-24 |
| 402 | Robust Noisy Pseudo-label Learning for Semi-supervised Medical Image Segmentation Using Diffusion Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel diffusion-based framework for semi-supervised medical image segmentation. |
LIN XI et. al. | arxiv-cs.CV | 2025-07-22 |
| 403 | SeC: Advancing Complex Video Object Segmentation Via Progressive Concept Construction IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This limitation arises from their reliance on appearance matching, neglecting the human-like conceptual understanding of objects that enables robust identification across temporal dynamics. Motivated by this gap, we propose Segment Concept (SeC), a concept-driven segmentation framework that shifts from conventional feature matching to the progressive construction and utilization of high-level, object-centric representations. |
ZHIXIONG ZHANG et. al. | arxiv-cs.CV | 2025-07-21 |
| 404 | Improved Semantic Segmentation from Ultra-Low-Resolution RGB Images Applied to Privacy-Preserving Object-Goal Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a novel fully joint-learning method that integrates an agglomerative feature extractor and a segmentation-aware discriminator to solve ultra-low-resolution semantic segmentation, thereby enabling privacy-preserving, semantic object-goal navigation. |
Xuying Huang; Sicong Pan; Olga Zatsarynna; Juergen Gall; Maren Bennewitz; | arxiv-cs.RO | 2025-07-21 |
| 405 | DiSCO-3D : Discovering and Segmenting Sub-Concepts from Open-vocabulary Queries in NeRF Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose DiSCO-3D, the first method addressing the broader problem of 3D Open-Vocabulary Sub-concepts Discovery, which aims to provide a 3D semantic segmentation that adapts to both the scene and user queries. |
Doriand Petit; Steve Bourgeois; Vincent Gay-Bellile; Florian Chabot; Loïc Barthe; | arxiv-cs.CV | 2025-07-19 |
| 406 | A Novel Downsampling Strategy Based on Information Complementarity for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The results show that the HPD module provides an efficient solution for semantic segmentation tasks. |
Wenbo Yue; Chang Li; Guoping Xu; | arxiv-cs.CV | 2025-07-19 |
| 407 | Semantic Segmentation Based Scene Understanding in Autonomous Vehicles Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose several efficient models to investigate scene understanding through semantic segmentation. |
Ehsan Rassekh; | arxiv-cs.CV | 2025-07-18 |
| 408 | Semantic Image Segmentation Via Dynamic Curriculum Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
XIANG ZHANG et. al. | Applied Intelligence | 2025-07-17 |
| 409 | A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a privacy-preserving semantic-segmentation method for applying perceptual encryption to images used for model training in addition to test images. |
Homare Sueyoshi; Kiyoshi Nishikawa; Hitoshi Kiya; | arxiv-cs.CV | 2025-07-16 |
| 410 | On Splitting Lightweight Semantic Image Segmentation for Wireless Communications Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a novel approach to implementing semantic communication based on splitting the semantic image segmentation process between a resource constrained transmitter and the receiver. |
Ebrahim Abu-Helalah; Jordi Serra; Jordi Perez-Romero; | arxiv-cs.NI | 2025-07-14 |
| 411 | Segmentation Similarity Enhanced Semantic Related Entity Fusion for Multi-modal Knowledge Graph Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The segmentation of semantic data, including image segmentation and word-level descriptions, often contain implicit relationships between entities that are frequently overlooked by existing methodologies, thus limiting the effectiveness of reasoning tasks. Therefore, we propose a novel completion inference method based on fine-grained semantic segmentation, which enhances reasoning capability by utilizing implicit relationships between entities. |
Yunpeng Wang; Bo Ning; Xin Wang; Chengfei Liu; Guanyu Li; | sigir | 2025-07-13 |
| 412 | Hybrid Transformer-CNN Architecture for Enhanced Underwater Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jiaxing Zhang; Yujuan Sun; Hua Wang; Xiaofeng Zhang; | The Visual Computer | 2025-07-12 |
| 413 | Image Translation with Kernel Prediction Networks for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel image translation method, Domain Adversarial Kernel Prediction Network (DA-KPN), that guarantees semantic matching between the synthetic label and translation. DA-KPN estimates pixel-wise input transformation parameters of a lightweight and simple translation function. |
Cristina Mata; Michael S. Ryoo; Henrik Turbell; | arxiv-cs.CV | 2025-07-11 |
| 414 | MUVOD: A Novel Multi-view Video Object Segmentation Dataset and A Benchmark for 3D Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present MUVOD, a new multi-view video dataset for training and evaluating object segmentation in reconstructed real-world scenarios. |
BANGNING WEI et. al. | arxiv-cs.CV | 2025-07-10 |
| 415 | StixelNExT++: Lightweight Monocular Scene Segmentation and Representation for Collective Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents StixelNExT++, a novel approach to scene representation for monocular perception systems. |
Marcel Vosshans; Omar Ait-Aider; Youcef Mezouar; Markus Enzweiler; | arxiv-cs.CV | 2025-07-09 |
| 416 | RSRefSeg 2: Decoupling Referring Remote Sensing Image Segmentation with Foundation Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Referring Remote Sensing Image Segmentation provides a flexible and fine-grained framework for remote sensing scene analysis via vision-language collaborative interpretation. |
KEYAN CHEN et. al. | arxiv-cs.CV | 2025-07-08 |
| 417 | MOSU: Autonomous Long-range Robot Navigation with Multi-modal Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present MOSU, a novel autonomous long-range navigation system that enhances global navigation for mobile robots through multimodal perception and on-road scene understanding. |
JING LIANG et. al. | arxiv-cs.RO | 2025-07-07 |
| 418 | CoT-Segmenter: Enhancing OOD Detection in Dense Road Scenes Via Chain-of-Thought Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the presented challenges, we propose a novel CoT-based framework targeting OOD detection in road anomaly scenes. |
Jeonghyo Song; Kimin Yun; DaeUng Jo; Jinyoung Kim; Youngjoon Yoo; | arxiv-cs.CV | 2025-07-05 |
| 419 | No Time to Train! Training-Free Reference-Based Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We find that correspondences enable automatic generation of instance-level segmentation masks for downstream tasks and instantiate our ideas via a multi-stage, training-free method incorporating (1) memory bank construction; (2) representation aggregation and (3) semantic-aware feature matching. |
Miguel Espinosa; Chenhongyi Yang; Linus Ericsson; Steven McDonagh; Elliot J. Crowley; | arxiv-cs.CV | 2025-07-03 |
| 420 | A Gift from The Integration of Discriminative and Diffusion-based Generative Learning: Boundary Refinement Remote Sensing Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our theoretical analysis confirms that the diffusion denoising process significantly enhances the model’s ability to learn high-frequency features; however, we also observe that these models exhibit insufficient semantic inference for low-frequency features when guided solely by the original image. Therefore, we integrate the strengths of both discriminative and generative learning, proposing the Integration of Discriminative and diffusion-based Generative learning for Boundary Refinement (IDGBR) framework. |
Hao Wang; Keyan Hu; Xin Guo; Haifeng Li; Chao Tao; | arxiv-cs.CV | 2025-07-02 |
| 421 | SCDF: Seeing Clearly Through Dark and Fog, An Adaptive Semantic Segmentation Scheme for Autonomous Vehicle Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation is a pivotal research area in the advancement of autonomous driving, with a particular focus on addressing adverse weather conditions such as night, rain, … |
Zuobing Ying; Zhengcheng Lin; Zhenyu Li; Xiaochun Huang; Weiping Ding; | IEEE Transactions on Intelligent Transportation Systems | 2025-07-01 |
| 422 | Semantic-MoSeg: Semantics-Assisted Moving-Obstacle Segmentation in Bird-Eye-View for Autonomous Driving Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Bird-eye-view (BEV) perception for autonomous driving has become popular in recent years. Among various BEV perception tasks, moving-obstacle segmentation is very important, since … |
Shiyu Meng; Yuxiang Sun; | IEEE Transactions on Intelligent Transportation Systems | 2025-07-01 |
| 423 | A Weight-sharing Based RGB-T Image Semantic Segmentation Network with Hierarchical Feature Enhancement and Progressive Feature Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Weimin Xue; Yisha Liu; Zhuang Yan; | Neurocomputing | 2025-07-01 |
| 424 | SCSNet: Semantic Segmentation of Carbon Source and Sink in Remote Sensing Images Based on Multi-scale Transformer and Local Feature Fusion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Achieving carbon peak and carbon neutrality is a major strategic goal for China. With the gradual expansion of China’s carbon market, carbon monitoring plays an increasingly … |
Yang Liu; Wenqian Cao; Haige Xu; Yi Xie; Changwei Miao; | 2025 International Joint Conference on Neural Networks … | 2025-06-30 |
| 425 | PlantSegNeRF: A Few-shot, Cross-dataset Method for Plant 3D Instance Point Cloud Reconstruction Via Joint-channel NeRF with Multi-view Image Instance Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we proposed a novel approach called plant segmentation neural radiance fields (PlantSegNeRF), aiming to directly generate high-precision instance point clouds from multi-view RGB image sequences for a wide range of plant species. |
XIN YANG et. al. | arxiv-cs.CV | 2025-06-30 |
| 426 | High-quality Pseudo-labeling for Point Cloud Segmentation with Scene-level Annotation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Consequently, to enhance accuracy, this paper proposes a high-quality pseudo-label generation framework by exploring contemporary multi-modal information and region-point semantic consistency. |
Lunhao Duan; Shanshan Zhao; Xingxing Weng; Jing Zhang; Gui-Song Xia; | arxiv-cs.CV | 2025-06-29 |
| 427 | FA-Seg: A Fast and Accurate Diffusion-Based Method for Open-Vocabulary Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present FA-Seg, a Fast and Accurate training-free framework for open-vocabulary segmentation based on diffusion models. |
Quang-Huy Che; Vinh-Tiep Nguyen; | arxiv-cs.CV | 2025-06-29 |
| 428 | Layer Decomposition and Morphological Reconstruction for Task-Oriented Infrared Image Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Furthermore, achieving contrast enhancement without amplifying noise and losing important information remains a challenge. To address these challenges, we propose a task-oriented infrared image enhancement method. |
Siyuan Chai; Xiaodong Guo; Tong Liu; | arxiv-cs.CV | 2025-06-29 |
| 429 | SDRNET: Stacked Deep Residual Network for Accurate Semantic Segmentation of Fine-Resolution Remotely Sensed Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This article presents a stacked deep residual network (SDRNet) for semantic segmentation from FRRS images. |
NAFTALY WAMBUGU et. al. | arxiv-cs.CV | 2025-06-27 |
| 430 | Better to Teach Than to Give: Domain Generalized Semantic Segmentation Via Agent Queries with Diffusion Model Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel agent Query-driven learning framework based on Diffusion model guidance for DGSS, named QueryDiff. |
Fan Li; Xuan Wang; Min Qi; Zhaoxiang Zhang; Yuelei Xu; | icml | 2025-06-25 |
| 431 | A Global-Local Cross-Attention Network for Ultra-high Resolution Remote Sensing Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address these issues, we propose GLCANet (Global-Local Cross-Attention Network), a lightweight segmentation framework designed for UHR remote sensing imagery. GLCANet employs a dual-stream architecture to efficiently fuse global semantics and local details while minimizing GPU usage. |
Chen Yi; Shan LianLei; | arxiv-cs.CV | 2025-06-24 |
| 432 | Open-Vocabulary Camouflaged Object Segmentation with Cascaded Vision Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Open-Vocabulary Camouflaged Object Segmentation (OVCOS) seeks to segment and classify camouflaged objects from arbitrary categories, presenting unique challenges due to visual ambiguity and unseen categories. Recent approaches typically adopt a two-stage paradigm: first segmenting objects, then classifying the segmented regions using Vision Language Models (VLMs). |
KAI ZHAO et. al. | arxiv-cs.CV | 2025-06-24 |
| 433 | DepthSeg: Depth Prompting in Remote Sensing Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a depth prompting two-dimensional (2D) remote sensing semantic segmentation framework (DepthSeg). |
NING ZHOU et. al. | arxiv-cs.CV | 2025-06-17 |
| 434 | Image Segmentation with Large Language Models: A Survey with Perspectives for Intelligent Transportation Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For intelligent transportation systems (ITS), where accurate scene understanding is critical for safety and efficiency, this new paradigm offers unprecedented capabilities. This survey systematically reviews the emerging field of LLM-augmented image segmentation, focusing on its applications, challenges, and future directions within ITS. |
Sanjeda Akter; Ibne Farabi Shihab; Anuj Sharma; | arxiv-cs.CV | 2025-06-16 |
| 435 | A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this survey, we present a holistic review of recent advances in VSP, covering a wide array of vision tasks, including Video Semantic Segmentation (VSS), Video Instance Segmentation (VIS), Video Panoptic Segmentation (VPS), as well as Video Tracking and Segmentation (VTS), and Open-Vocabulary Video Segmentation (OVVS). |
GUOHUAN XIE et. al. | arxiv-cs.CV | 2025-06-16 |
| 436 | InceptionMamba: Efficient Multi-Stage Feature Enhancement with Selective State Space Model for Microscopic Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, their reliance on the availability of large datasets for improved performance, along with the high computational cost, limit their practicality. To address these issues, we propose an efficient framework for the segmentation task, named InceptionMamba, which encodes multi-stage rich features and offers both performance and computational efficiency. |
DANIYA NAJIHA ABDUL KAREEM et. al. | arxiv-cs.CV | 2025-06-13 |
| 437 | Semantic Localization Guiding Segment Anything Model For Reference Remote Sensing Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current RRSIS methods rely on multi-modal fusion backbones and semantic segmentation heads but face challenges like dense annotation requirements and complex scene interpretation. To address these issues, we propose a framework named prompt-generated semantic localization guiding Segment Anything Model (PSLG-SAM), which decomposes the RRSIS task into two stages: coarse localization and fine segmentation. |
Shuyang Li; Shuang Wang; Zhuangzhuang Sun; Jing Xiao; | arxiv-cs.CV | 2025-06-12 |
| 438 | Symmetrical Flow Matching: Unified Image Generation, Segmentation, and Classification with Score-Based Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work introduces Symmetrical Flow Matching (SymmFlow), a new formulation that unifies semantic segmentation, classification, and image generation within a single model. |
Francisco Caetano; Christiaan Viviers; Peter H. N. De With; Fons van der Sommen; | arxiv-cs.CV | 2025-06-12 |
| 439 | Urban1960SatSeg: Unsupervised Semantic Segmentation of Mid-20th Century Urban Landscapes with Satellite Imageries Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, severe quality degradation (e.g., distortion, misalignment, and spectral scarcity) and annotation absence have long hindered semantic segmentation on such historical RS imagery. To bridge this gap and enhance understanding of urban development, we introduce Urban1960SatBench, an annotated segmentation dataset based on historical satellite imagery with the earliest observation time among all existing segmentation datasets, along with a benchmark framework for unsupervised segmentation tasks, Urban1960SatUSM. |
TIANXIANG HAO et. al. | arxiv-cs.CV | 2025-06-11 |
| 440 | MSSDF: Modality-Shared Self-supervised Distillation for High-Resolution Multi-modal Remote Sensing Image Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, acquiring high-quality labeled data is often costly and time-consuming. To address this challenge, we propose a multi-modal self-supervised learning framework that leverages high-resolution RGB images, multi-spectral data, and digital surface models (DSM) for pre-training. |
TONG WANG et. al. | arxiv-cs.CV | 2025-06-10 |
| 441 | Segment Any Architectural Facades (SAAF): An Automatic Segmentation Model for Building Facades, Walls and Windows Based on Multimodal Semantics Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study proposes an automatic segmentation model for building facade walls and windows based on multimodal semantic guidance, called Segment Any Architectural Facades (SAAF). |
PEILIN LI et. al. | arxiv-cs.CV | 2025-06-09 |
| 442 | PIG: Physically-based Multi-Material Interaction with 3D Gaussians Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, in a scene represented by 3D Gaussian primitives, interactions between objects suffer from inaccurate 3D segmentation, imprecise deformation among different materials, and severe rendering artifacts. To address these challenges, we introduce PIG: Physically-Based Multi-Material Interaction with 3D Gaussians, a novel approach that combines 3D object segmentation with the simulation of interacting objects in high precision. |
ZEYU XIAO et. al. | arxiv-cs.GR | 2025-06-09 |
| 443 | Enhanced 3D Scene Reconstruction with Semantic Understanding Using Synthetic Data and Deep Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This research presents an advanced pipeline for 2D and 3D scene reconstruction with semantic understanding. The synthetic data generation, semantic segmentation, and depth … |
A. Kuqi; Ambra Korra; I. Enesi; | 2025 20th Annual System of Systems Engineering Conference … | 2025-06-08 |
| 444 | SUM Parts: Benchmarking Part-Level Semantic Segmentation of Urban Meshes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces SUM Parts, the first large-scale dataset for urban textured meshes with part-level semantic labels, covering about 2.5 km² with 21 classes. |
Weixiao Gao; Liangliang Nan; Hugo Ledoux; | cvpr | 2025-06-07 |
| 445 | A Semantic Knowledge Complementarity Based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose SKCDF, a semantic knowledge complementarity based decoupling framework for multi-organ segmentation in class-imbalanced medical images. |
ZHENG ZHANG et. al. | cvpr | 2025-06-07 |
| 446 | RelationField: Relate Anything in Radiance Fields IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, current methods primarily focus on object-centric representations, supporting object segmentation or detection, while understanding semantic relationships between objects remains largely unexplored. To address this gap, we propose RelationField, the first method to extract inter-object relationships directly from neural radiance fields. |
SEBASTIAN KOCH et. al. | cvpr | 2025-06-07 |
| 447 | DocSAM: Unified Document Image Segmentation Via Query Decomposition and Heterogeneous Mixed Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Document image segmentation is crucial in document analysis and recognition but remains challenging due to the heterogeneity of document formats and diverse segmentation tasks. … |
Xiao-Hui Li; Fei Yin; Cheng-Lin Liu; | cvpr | 2025-06-07 |
| 448 | EntitySAM: Segment Everything in Video Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we introduce an entity decoder to facilitate inter-object communication and an automatic prompt generator using learnable object queries. |
Mingqiao Ye; Seoung Wug Oh; Lei Ke; Joon-Young Lee; | cvpr | 2025-06-07 |
| 449 | High Temporal Consistency Through Semantic Similarity Propagation in Semi-Supervised Video Semantic Segmentation for Autonomous Flight Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a lightweight video semantic segmentation approach–suited to onboard real-time inference–achieving high temporal consistency on aerial data through Semantic Similarity Propagation across frames. |
Cédric Vincent; Taehyoung Kim; Henri Meeß; | cvpr | 2025-06-07 |
| 450 | BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we revisit 3D semantic segmentation through a more granular lens, shedding light on subtle complexities that are typically overshadowed by broader performance metrics. |
Weiguang Zhao; Rui Zhang; Qiufeng Wang; Guangliang Cheng; Kaizhu Huang; | cvpr | 2025-06-07 |
| 451 | Dr. Splat: Directly Referring 3D Gaussian Splatting Via Direct Language Embedding Registration IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Dr. Splat, a novel approach for open-vocabulary 3D scene understanding leveraging 3D Gaussian Splatting. |
KIM JUN-SEONG et. al. | cvpr | 2025-06-07 |
| 452 | DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To improve the FSS pipeline, we propose a novel framework that utilizes large language models (LLMs) to adapt general class semantic information to the query image. |
Amin Karimi; Charalambos Poullis; | cvpr | 2025-06-07 |
| 453 | Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper explores scene affinity (AIScene), namely intra-scene consistency and inter-scene correlation, for semi-supervised LiDAR semantic segmentation in driving scenes. |
CHUANDONG LIU et. al. | cvpr | 2025-06-07 |
| 454 | Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While freezing the text encoder preserves its powerful embeddings, recent studies show that fine-tuning both the text and image encoders jointly significantly enhances segmentation performance, especially for classes from open sets. In this work, we explain this phenomenon from the perspective of hierarchical alignment, since during fine-tuning, the hierarchy level of image embeddings shifts from image-level to pixel-level. |
ZELIN PENG et. al. | cvpr | 2025-06-07 |
| 455 | Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces MPEC, a novel Masked Point-Entity Contrastive learning method for open-vocabulary 3D semantic segmentation that leverages both 3D entity-language alignment and point-entity consistency across different point cloud views to foster entity-specific feature representations. |
Yan Wang; Baoxiong Jia; Ziyu Zhu; Siyuan Huang; | cvpr | 2025-06-07 |
| 456 | MaSS13K: A Matting-level Semantic Segmentation Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we build a large-scale, matting-level semantic segmentation dataset, named MaSS13K, which consists of 13,348 real-world images, all at 4K resolution. |
Chenxi Xie; Minghan Li; Hui Zeng; Jun Luo; Lei Zhang; | cvpr | 2025-06-07 |
| 457 | NightAdapter: Learning A Frequency Adapter for Generalizable Night-time Scene Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Night-time scene segmentation is a critical yet challenging task in the real-world applications, primarily due to the complicated lighting conditions. However, existing methods … |
QI BI et. al. | cvpr | 2025-06-07 |
| 458 | VidSeg: Training-free Video Semantic Segmentation Based on Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce the first training-free approach for Video Semantic Segmentation (VSS) based on pre-trained diffusion models. |
QIAN WANG et. al. | cvpr | 2025-06-07 |
| 459 | Scaling Up Image Segmentation Across Data and Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditional segmentation models, while effective in isolated tasks, often fail to generalize to more complex and open-ended segmentation problems, such as free-form, open-vocabulary, and in-the-wild scenarios. To bridge this gap, we propose to scale up image segmentation across diverse datasets and tasks such that the knowledge across different tasks and datasets can be integrated while improving the generalization ability. |
PEI WANG et. al. | cvpr | 2025-06-07 |
| 460 | FFR: Frequency Feature Rectification for Weakly Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we identify that attenuated high-frequency features mislead the decoder of ViT-based WSSS models, resulting in over-smoothed false segmentation. To address this, we propose a Frequency Feature Rectification (FFR) framework to rectify the false segmentations caused by attenuated high-frequency features and enhance the learning of high-frequency features in the decoder. |
Ziqian Yang; Xinqiao Zhao; Xiaolei Wang; Quan Zhang; Jimin Xiao; | cvpr | 2025-06-07 |
| 461 | Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we design Unified-Lift, a new end-to-end object-aware lifting approach that aims for high-quality 3D segmentation based on our object-aware 3D Gaussian representation. |
RUNSONG ZHU et. al. | cvpr | 2025-06-07 |
| 462 | An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose an end-to-end robust semantic Segmentation Network based on a Conditional-Noise Framework (CNF) of DDPMs, named CDSegNet. |
Wentao Qu; Jing Wang; YongShun Gong; Xiaoshui Huang; Liang Xiao; | cvpr | 2025-06-07 |
| 463 | Zero-Shot 4D Lidar Panoptic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the primary challenge in advancing research and developing generalized, versatile methods for spatio-temporal scene understanding in Lidar lies in the scarcity of datasets that provide the necessary diversity and scale of annotations. To overcome these challenges, we propose SAL-4D (Segment Anything in Lidar–4D), a method that utilizes multi-modal robotic sensor setups as a bridge to distill recent developments in Video Object Segmentation (VOS) in conjunction with off-the-shelf Vision-Language foundation models to Lidar. |
Yushan Zhang; Aljoša Ošep; Laura Leal-Taixé; Tim Meinhardt; | cvpr | 2025-06-07 |
| 464 | FALCON: Fairness Learning Via Contrastive Attention Approach to Continual Semantic Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work presents a novel Fairness Learning via Contrastive Attention Approach to continual learning in semantic scene understanding. |
Thanh-Dat Truong; Utsav Prabhu; Bhiksha Raj; Jackson Cothren; Khoa Luu; | cvpr | 2025-06-07 |
| 465 | Efficient Decoupled Feature 3D Gaussian Splatting Via Hierarchical Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing 3DGS-based methods embed both color and high-dimensional semantic features into a single field, leading to significant storage and computational overhead. To mitigate this, we propose Decoupled Feature 3D Gaussian Splatting (DF-3DGS), a novel method that decouples the color and semantic fields, thereby reducing the number of 3D Gaussians required for semantic representation. |
Zhenqi Dai; Ting Liu; Yanning Zhang; | cvpr | 2025-06-07 |
| 466 | PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose PSA-SSL, a novel extension to point cloud SSL that learns object pose and size-aware (PSA) features. |
Barza Nisar; Steven L. Waslander; | cvpr | 2025-06-07 |
| 467 | Segment Any Motion in Videos IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel approach for moving object segmentation that combines long-range trajectory motion cues with DINO-based semantic features and leverages SAM2 for pixel-level mask densification through an iterative prompting strategy. |
NAN HUANG et. al. | cvpr | 2025-06-07 |
| 468 | Convex Combination Star Shape Prior for Data-driven Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a convex combination star (CCS) shape, possessing multi-center star shape properties, and has the advantage of effectively controlling the shape of the region through a smooth field function. |
Xinyu Zhao; Jun Xie; Shengzhe Chen; Jun Liu; | cvpr | 2025-06-07 |
| 469 | Benchmarking Large Vision-Language Models Via Directed Scene Graph for Comprehensive Image Captioning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce a detailed caption benchmark, termed as CompreCap, to evaluate the visual context from a directed scene graph view. |
FAN LU et. al. | cvpr | 2025-06-07 |
| 470 | Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Applying this pipeline to multiple 3D scene datasets, we create Mosaic3D-5.6M, a dataset of more than 30K annotated scenes with 5.6M mask-text pairs – significantly larger than existing datasets. Building on these data, we propose Mosaic3D, a 3D visual foundation model (3D-VFM) combining a 3D encoder trained with contrastive learning and a lightweight mask decoder for open-vocabulary 3D semantic and instance segmentation. |
JUNHA LEE et. al. | cvpr | 2025-06-07 |
| 471 | A Dataset for Semantic Segmentation in The Presence of Unknowns Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing datasets allow evaluation of only either knowns or unknowns – but not both, which is required to establish "in the wild" suitability of deep neural network models. To bridge this gap, we propose a novel anomaly segmentation dataset, ISSU, featuring a diverse set of anomaly inputs from cluttered real-world environments. |
ZAKARIA LASKAR et. al. | cvpr | 2025-06-07 |
| 472 | COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Unlike existing approaches that remove ambiguous Gaussians and sacrifice visual quality, COB-GS, as a 3DGS refinement method, jointly optimizes semantic and visual information, allowing the two different levels to cooperate with each other effectively. Specifically, for the semantic guidance, we introduce a boundary-adaptive Gaussian splitting technique that leverages semantic gradient statistics to identify and split ambiguous Gaussians, aligning them closely with object boundaries. |
Jiaxin Zhang; Junjun Jiang; Youyu Chen; Kui Jiang; Xianming Liu; | cvpr | 2025-06-07 |
| 473 | Using Diffusion Priors for Video Amodal Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose to tackle video amodal segmentation by formulating it as a conditional generation task, thereby capitalizing on the foundational knowledge in video generative models. |
Kaihua Chen; Deva Ramanan; Tarasha Khurana; | cvpr | 2025-06-07 |
| 474 | Rethinking Semi-supervised Segmentation Beyond Accuracy: Reliability and Robustness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These qualities, which ensure consistent performance under diverse conditions (robustness) and well-calibrated model confidences as well as meaningful uncertainties (reliability), are essential for safety-critical applications like autonomous driving, where models must handle unpredictable environments and avoid sudden failures at all costs. To address this gap, we introduce the Reliable Segmentation Score (RSS), a novel metric that combines predictive accuracy, calibration, and uncertainty quality measures via a harmonic mean. |
Steven Landgraf; Markus Hillemann; Markus Ulrich; | arxiv-cs.CV | 2025-06-06 |
| 475 | Real-Time Image Semantic Segmentation Based on Improved DeepLabv3+ Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: To improve the performance of the image semantic segmentation algorithm and make the algorithm achieve a better balance between accuracy and real-time performance when segmenting … |
Peibo Li; Jiangwu Zhou; Xiaohua Xu; | Big Data Cogn. Comput. | 2025-06-06 |
| 476 | A Large-Scale Referring Remote Sensing Image Segmentation Dataset and Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing datasets for RRSIS suffer from critical limitations in resolution, scene diversity, and category coverage, which hinders the generalization and real-world applicability of refer segmentation models. To facilitate the development of this field, we introduce NWPU-Refer, the largest and most diverse RRSIS dataset to date, comprising 15,003 high-resolution images (1024-2048px) spanning 30+ countries with 49,745 annotated targets supporting single-object, multi-object, and non-object segmentation scenarios. |
ZHIGANG YANG et. al. | arxiv-cs.CV | 2025-06-04 |
| 477 | Talk2SAM: Text-Guided Semantic Enhancement for Complex-Shaped Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These models often struggle with thin structures and fine boundaries, leading to poor segmentation quality. We propose Talk2SAM, a novel approach that integrates textual guidance to improve segmentation of such challenging objects. |
Luka Vetoshkin; Dmitry Yudin; | arxiv-cs.CV | 2025-06-03 |
| 478 | 3DLST: 3D Learnable Supertoken Transformer for LiDAR Point Cloud Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Dening Lu; Linlin Xu; Jun Zhou; Kyle Gao; Jonathan Li; | Int. J. Appl. Earth Obs. Geoinformation | 2025-06-01 |
| 479 | A Real-time Semantic Segmentation Network Leveraging Spatial and Contextual Features for Enhanced Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Haifeng Sima; Meng Gao; Lanlan Liu; | Intell. Syst. Appl. | 2025-06-01 |
| 480 | AR-Light: Enabling Fast and Lightweight Multi-User Augmented Reality Via Semantic Segmentation and Collaborative View Synchronization Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Multi-user Augmented Reality (MuAR) allows multiple users to interact with shared virtual objects, facilitated by exchanging environment information. Current MuAR systems rely on … |
YU WEN et. al. | IEEE Transactions on Computers | 2025-06-01 |
| 481 | HMFENet: Hierarchical Matching Guided Feature Enhancement Network for Few-Shot RGB-Thermal Urban Scene Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: RGB-Thermal semantic segmentation provides reliable support for intelligent traffic perception systems, such as road safety monitoring and autonomous driving perception, by fusing … |
XIANGYU ZHOU et. al. | IEEE Transactions on Intelligent Transportation Systems | 2025-06-01 |
| 482 | Thermal Image-guided Complementary Masking with Multiscale Fusion for Multi-spectral Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Zeyang Chen; Mingnan Hu; Bo Chen; | Eng. Appl. Artif. Intell. | 2025-06-01 |
| 483 | A Novel Hierarchical Generative Model for Semi-Supervised Semantic Segmentation of Biomedical Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In biomedical vision research, a significant challenge is the limited availability of pixel-wise labeled data. Data augmentation has been identified as a solution to this issue … |
Lu Chai; Zidong Wang; Yuheng Shao; Qinyuan Liu; | IEEE Transactions on Emerging Topics in Computational … | 2025-06-01 |
| 484 | WCMamba: Enhancing High-resolution Remote Sensing Image Semantic Segmentation with Pyramid Wavelet Convolution and SS2D IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Chao Zhan; Kui Yang; | Knowl. Based Syst. | 2025-06-01 |
| 485 | Cascading Attention Enhancement Network for RGB-D Indoor Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
XU TANG et. al. | Comput. Vis. Image Underst. | 2025-06-01 |
| 486 | Scene Detection Policies and Keyframe Extraction Strategies for Large-Scale Video Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a unified, adaptive framework for automatic scene detection and keyframe selection that handles formats ranging from short-form media to long-form films, archival content, and surveillance footage. |
Vasilii Korolkov; | arxiv-cs.CV | 2025-05-31 |
| 487 | Bridging Classical and Modern Computer Vision: PerceptiveNet for Tree Crown Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Compared to the traditional methods, Deep Learning models improve accuracy by extracting informative and discriminative features, but often fall short in capturing the aforementioned complexities. To address these challenges, we propose PerceptiveNet, a novel model incorporating a Logarithmic Gabor-parameterised convolutional layer with trainable filter parameters, alongside a backbone that extracts salient features while capturing extensive context and spatial information through a wider receptive field. |
Georgios Voulgaris; | arxiv-cs.CV | 2025-05-29 |
| 488 | Federated Unsupervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Extending these ideas to federated settings requires feature representation and cluster centroid alignment across distributed clients, an inherently difficult task under heterogeneous data distributions in the absence of supervision. To address this, we propose FUSS (Federated Unsupervised image Semantic Segmentation), which is, to our knowledge, the first framework to enable fully decentralized, label-free semantic segmentation training. |
Evangelos Charalampakis; Vasileios Mygdalis; Ioannis Pitas; | arxiv-cs.CV | 2025-05-29 |
| 489 | LiDAR Based Semantic Perception for Forklifts in Outdoor Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we present a novel LiDAR-based semantic segmentation framework tailored for autonomous forklifts operating in complex outdoor environments. |
Benjamin Serfling; Hannes Reichert; Lorenzo Bayerlein; Konrad Doll; Kati Radkhah-Lens; | arxiv-cs.RO | 2025-05-28 |
| 490 | Zero-Shot Pseudo Labels Generation Using SAM and CLIP for Semi-Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this approach, the accuracy of the semantic segmentation model depends on the quality of the pseudo labels, and the quality of the pseudo labels depends on the performance of the model to be trained and the amount of data with annotated labels. In this paper, we generate pseudo labels using zero-shot annotation with the Segment Anything Model (SAM) and Contrastive Language-Image Pretraining (CLIP), improve the accuracy of the pseudo labels using the Unified Dual-Stream Perturbations Approach (UniMatch), and use them as enhanced labels to train a semantic segmentation model. |
Nagito Saito; Shintaro Ito; Koichi Ito; Takafumi Aoki; | arxiv-cs.CV | 2025-05-26 |
| 491 | ADD-SLAM: Adaptive Dynamic Dense SLAM with Gaussian Splatting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods that employ semantic segmentation or object detection for dynamic identification and filtering typically rely on predefined categorical priors, while discarding dynamic scene information crucial for robotic applications such as dynamic obstacle avoidance and environmental interaction. To overcome these challenges, we propose ADD-SLAM: an Adaptive Dynamic Dense SLAM framework based on Gaussian splatting. |
WENHUA WU et. al. | arxiv-cs.CV | 2025-05-25 |
| 492 | Semantic Segmentation with Reward Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Sometimes, we need a semantic segmentation network, and even a visual encoder can have a high compatibility, and can be trained using various types of feedback beyond traditional labels, such as feedback that indicates the quality of the parsing results. To tackle this issue, we proposed RSS (Reward in Semantic Segmentation), the first practical application of reward-based reinforcement learning on pure semantic segmentation offered in two granular levels (pixel-level and image-level). |
Xie Ting; Ye Huang; Zhilin Liu; Lixin Duan; | arxiv-cs.CV | 2025-05-23 |
| 493 | EMRA-proxy: Enhancing Multi-Class Region Semantic Segmentation in Remote Sensing Images with Attention Proxy Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: High-resolution remote sensing (HRRS) image segmentation is challenging due to complex spatial layouts and diverse object appearances. While CNNs excel at capturing local … |
YICHUN YU et. al. | arxiv-cs.CV | 2025-05-23 |
| 494 | SemSegBench & DetecBench: Benchmarking Reliability and Generalization Beyond Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To facilitate research towards robust model design in segmentation and detection, our primary objective is to provide benchmarking tools regarding robustness to distribution shifts and adversarial manipulations. |
SHASHANK AGNIHOTRI et. al. | arxiv-cs.CV | 2025-05-23 |
| 495 | TextureSAM: Towards A Texture Aware Foundation Model for Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we investigate SAM’s bias toward semantics over textures and introduce a new texture-aware foundation model, TextureSAM, which performs superior segmentation in texture-dominant scenarios. |
Inbal Cohen; Boaz Meivar; Peihan Tu; Shai Avidan; Gal Oren; | arxiv-cs.CV | 2025-05-22 |
| 496 | From Pixels to Images: Deep Learning Advances in Remote Sensing Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This review offers a holistic view of DL-based SS for RS, highlighting key advancements, comparative insights, and open challenges to guide future research. |
Quanwei Liu; Tao Huang; Yanni Dong; Jiaqi Yang; Wei Xiang; | arxiv-cs.CV | 2025-05-21 |
| 497 | Multi-View Projection for Unsupervised Domain Adaptation in 3D Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a multi-view projection framework for unsupervised domain adaptation (UDA). |
Andrew Caunes; Thierry Chateau; Vincent Fremont; | arxiv-cs.CV | 2025-05-21 |
| 498 | Scan, Materialize, Simulate: A Generalizable Framework for Physically Grounded Robot Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present Scan, Materialize, Simulate (SMS), a unified framework that combines 3D Gaussian Splatting for accurate scene reconstruction, visual foundation models for semantic segmentation, vision-language models for material property inference, and physics simulation for reliable prediction of action outcomes. |
Amine Elhafsi; Daniel Morton; Marco Pavone; | arxiv-cs.RO | 2025-05-20 |
| 499 | LE-Object: Language Embedded Object-Level Neural Radiance Fields for Open-Vocabulary Scene Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recent advancements in Visual Language Models (VLMs) have significantly driven research in open-vocabulary 3D scene reconstruction, showcasing strong potential in open-set … |
Mengting Wang; Yunzhou Zhang; Xingshuo Wang; Zhiyao Zhang; Zhiteng Li; | 2025 IEEE International Conference on Robotics and … | 2025-05-19 |
| 500 | Enhancing Shape Perception and Segmentation Consistency for Industrial Image Inspection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, a Shape-Aware Efficient Network (SPENet) is proposed, which focuses on the shapes of objects to achieve excellent segmentation consistency by separately supervising the extraction of boundary and body information from images. |
Guoxuan Mao; Ting Cao; Ziyang Li; Yuan Dong; | arxiv-cs.CV | 2025-05-19 |
| 501 | MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we explore the potential of a pure visual foundation model as an alternative to widely used vision-language models for universal visual anomaly segmentation. |
Bin-Bin Gao; | arxiv-cs.CV | 2025-05-14 |
| 502 | MESSI: A Multi-Elevation Semantic Segmentation Image Dataset of An Urban Environment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a Multi-Elevation Semantic Segmentation Image (MESSI) dataset comprising 2525 images taken by a drone flying over dense urban environments. |
Barak Pinkovich; Boaz Matalon; Ehud Rivlin; Hector Rotstein; | arxiv-cs.CV | 2025-05-13 |
| 503 | Efficient Semantic Segmentation Via Advanced Prompt Tuning Techniques Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation is a fundamental task in computer vision with critical applications in autonomous driving, medical imaging, and remote sensing. However, optimizing … |
Rima Hasna Yamouni; Rim Trabelsi; A. Cabani; F. Abdelkefi; | 2025 International Wireless Communications and Mobile … | 2025-05-12 |
| 504 | Method for Semantic Image Segmentation Based on The Neural Network with Gabor Filters Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
E. Murin; D. V. Sorokin; A. S. Krylov; | Program. Comput. Softw. | 2025-05-12 |
| 505 | Technical Report for ICRA 2025 GOOSE 2D Semantic Segmentation Challenge: Leveraging Color Shift Correction, RoPE-Swin Backbone, and Quantile-based Label Denoising Strategy for Robust Outdoor Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This report presents our semantic segmentation framework developed by team ACVLAB for the ICRA 2025 GOOSE 2D Semantic Segmentation Challenge, which focuses on parsing outdoor scenes into nine semantic categories under real-world conditions. |
CHIH-CHUNG HSU et. al. | arxiv-cs.CV | 2025-05-11 |
| 506 | Boosting Cross-spectral Unsupervised Domain Adaptation for Thermal Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a comprehensive study on cross-spectral UDA for thermal image semantic segmentation. |
Seokjun Kwon; Jeongmin Shin; Namil Kim; Soonmin Hwang; Yukyung Choi; | arxiv-cs.CV | 2025-05-11 |
| 507 | MultiTaskVIF: Segmentation-oriented Visible and Infrared Image Fusion Via Multi-task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most existing segmentation-oriented VIF methods adopt a cascade structure comprising separate fusion and segmentation models, leading to increased network complexity and redundancy. This raises a critical question: can we design a more concise and efficient structure to integrate semantic information directly into the fusion model during training? Inspired by multi-task learning, we propose a concise and universal training framework, MultiTaskVIF, for segmentation-oriented VIF models. |
Zixian Zhao; Andrew Howes; Xingchen Zhang; | arxiv-cs.CV | 2025-05-10 |
| 508 | CLIMS++: Cross Language Image Matching with Automatic Context Discovery for Weakly Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
JINHENG XIE et. al. | Int. J. Comput. Vis. | 2025-05-09 |
| 509 | Segment Any RGB-Thermal Model with Language-aided Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Given that RGB-T provides a robust solution for scene understanding in adverse weather and lighting conditions, such as low light and overexposure, we propose a novel framework, SARTM, which customizes the powerful SAM for RGB-T semantic segmentation. |
DONG XING et. al. | arxiv-cs.CV | 2025-05-03 |
| 510 | Parallel Segmentation Network for Real-time Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Guanke Chen; Haibin Li; Yaqian Li; Wenming Zhang; Tao Song; | Eng. Appl. Artif. Intell. | 2025-05-01 |
| 511 | SegTrackDetect: A Window-based Framework for Tiny Object Detection Via Semantic Segmentation and Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Aleksandra Kos; Karol Majek; Dominik Belter; | SoftwareX | 2025-05-01 |
| 512 | NTRENet++: Unleashing The Power of Non-Target Knowledge for Few-Shot Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Few-shot semantic segmentation (FSS) aims to segment the target object under the condition of a few annotated samples. However, current studies on FSS primarily concentrate on … |
YUANWEI LIU et. al. | IEEE Transactions on Circuits and Systems for Video … | 2025-05-01 |
| 513 | Improving RGB-Thermal Semantic Scene Understanding With Synthetic Data Augmentation for Autonomous Driving Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic scene understanding is an important capability for autonomous vehicles. Despite recent advances in RGB-Thermal (RGB-T) semantic segmentation, existing methods often rely … |
Haotian Li; H. K. Chu; Yuxiang Sun; | IEEE Robotics and Automation Letters | 2025-05-01 |
| 514 | Dynamic Mutual Training Semi-supervised Semantic Segmentation Algorithm with Adaptive Capability (AD-DMT) for Choy Sum Stem Segmentation and 3D Positioning of Cutting Points Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
KAI YUAN et. al. | Comput. Electron. Agric. | 2025-05-01 |
| 515 | Mamba Based Feature Extraction And Adaptive Multilevel Feature Fusion For 3D Tumor Segmentation From Multi-modal Medical Image Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a Mamba based feature extraction and adaptive multilevel feature fusion for 3D tumor segmentation using multi-modal medical image. |
ZEXIN JI et. al. | arxiv-cs.CV | 2025-04-29 |
| 516 | A Data-Centric Approach to 3D Semantic Segmentation of Railway Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces two targeted data augmentation methods designed to improve segmentation performance on the railway-specific OSDaR23 dataset. |
NICOLAS MÜNGER et. al. | arxiv-cs.CV | 2025-04-25 |
| 517 | SAIP-Net: Enhancing Remote Sensing Image Segmentation Via Spectral Adaptive Information Propagation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address limitations arising from spatial domain feature fusion and insufficient receptive fields, this paper introduces SAIP-Net, a novel frequency-aware segmentation framework that leverages Spectral Adaptive Information Propagation. |
Zhongtao Wang; Xizhe Cao; Yisong Chen; Guoping Wang; | arxiv-cs.CV | 2025-04-23 |
| 518 | Occlusion-Ordered Semantic Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose to solve the joint task of relative depth ordering and segmentation of instances based on occlusions. |
Soroosh Baselizadeh; Cheuk-To Yu; Olga Veksler; Yuri Boykov; | arxiv-cs.CV | 2025-04-18 |
| 519 | Lightweight Road Environment Segmentation Using Vector Quantization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: (3) Vector quantization encourages the latent space to form coarse clusters of continuous features, forcing the model to group similar features, making the learned representations more structured for the decoding process. In this work, we combined vector quantization with the lightweight image segmentation model MobileUNETR and used it as a baseline model for comparison to demonstrate its efficiency. |
Jiyong Kwag; Alper Yilmaz; Charles Toth; | arxiv-cs.CV | 2025-04-18 |
| 520 | DC-SAM: In-Context Segment Anything in Images and Videos Via Dual Consistency Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose the Dual Consistency SAM (DC-SAM) method based on prompt-tuning to adapt SAM and SAM2 for in-context segmentation of both images and videos. |
MENGSHI QI et. al. | arxiv-cs.CV | 2025-04-16 |
| 521 | A Weakly Supervised Semantic Segmentation Model with Enhanced CLIP Feature Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper addresses the limitations of the Contrastive Language-Image Pre-training (CLIP) model’s image encoder and proposes a segmentation model WSSS-ECFE with enhanced CLIP feature extraction, aiming to improve the performance of the Weakly Supervised Semantic Segmentation (WSSS) task. |
F. Kong; J. Lu; | icassp | 2025-04-15 |
| 522 | U-SAM: Upgrade Segment Anything Model With Semantic-Aware and Memory-Efficient Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: (2) SAM’s inefficient use of instance-independent visual features and tokens necessitates maintaining unique features and tokens for each instance, leading to excessive GPU memory consumption and diminished segmentation efficiency. To address these issues, we propose the Universal Segment Anything Model (U-SAM), a semantic-aware and memory-efficient segmentation model designed to perform both promptable and traditional segmentation tasks within a compact and unified framework. |
X. Jin; J. Hu; J. Lin; S. Zhang; L. Cao; | icassp | 2025-04-15 |
| 523 | Text-Guided Few-Shot Semantic Segmentation with Training-Free Multimodal Feature Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a training-free approach using multimodal feature matching that performs segmentation by identifying regions in a target image that match the features from both the image and text references. |
G. Buthmann; T. Sakai; H. Qiu; T. Katsuki; D. Kimura; | icassp | 2025-04-15 |
| 524 | PraNet-V2: Dual-Supervised Reverse Attention for Medical Image Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, PraNet-V1 struggles with multi-class segmentation tasks. To address this limitation, we propose PraNet-V2, which, compared to PraNet-V1, effectively performs a broader range of tasks including multi-class segmentation. |
Bo-Cheng Hu; Ge-Peng Ji; Dian Shao; Deng-Ping Fan; | arxiv-cs.CV | 2025-04-15 |
| 525 | FCoDT-Net: A Novel Framework for High-Precision Medical Image Segmentation Using Contextual Distillation Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The unused information leads to suboptimal segmentation results. In this paper, we propose the Feature Context Distillation Transformer Network (FCoDT-Net), a deep learning model designed to address these limitations by leveraging the rich contextual information within the skip connections. |
Q. YuTao; Y. SiZhe; H. Bang; R. Wei; | icassp | 2025-04-15 |
| 526 | Harnessing Light Field Angular Cues and Spatial Geometries for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a novel backbone network called the Light Field Extraction Interaction Network (LFEI-Net). |
C. Jia; F. Shi; X. Cheng; | icassp | 2025-04-15 |
| 527 | Dual-Path Consistency Unsupervised Domain Adaptation for Nighttime Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, it is often hindered by the lack of annotations due to interference caused by inadequate lighting or exposure. To overcome these difficulties, we propose a Dual-Path Consistency (DPC) unsupervised domain adaptation (UDA) approach. |
Y. Lu; J. Lang; M. Ding; | icassp | 2025-04-15 |
| 528 | PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in The Wild Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This report provides a comprehensive overview of the 4th Pixel-level Video Understanding in the Wild (PVUW) Challenge, held in conjunction with CVPR 2025. |
HENGHUI DING et. al. | arxiv-cs.CV | 2025-04-15 |
| 529 | ES-NeRF: Enhancing Segmentation in NeRF with CLIP Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, they face the challenge of accurately and consistently segmenting objects in complex scenarios. To address this issue, we introduce the Enhancing Segmentation in NeRF with CLIP (ES-NeRF), which aims to improve the segmentation quality through feature fusion with the help of CLIP’s powerful semantic comprehension. |
C. ZHAO et. al. | icassp | 2025-04-15 |
| 530 | Rethinking Early-Fusion Strategies for Improved Multimodal Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these methods require massive parameter updates and computational effort during the feature extraction and fusion. To address this issue, we propose a novel multimodal fusion network (EFNet) based on an early fusion strategy and a simple but effective feature clustering for training efficient RGB-T semantic segmentation. |
Z. Shen; Y. Li; H. Zhang; Y. Weng; J. Wang; | icassp | 2025-04-15 |
| 531 | Joint Semantic Segmentation of Optical and SAR Image in Hazy Environments Via Cross-modal Information Rectification and Cross-attention Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a joint semantic segmentation of optical and SAR in hazy environments network that incorporates channel fusion for feature enhancement and cross-attention for feature fusion, enabling efficient segmentation of hazy optical images. |
X. Fan; L. Zhang; | icassp | 2025-04-15 |
| 532 | SPT: Sequence Prompt Transformer for Interactive Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods typically process one image at a time, failing to consider the sequential nature of the images. To overcome this limitation, we propose a novel method called Sequence Prompt Transformer (SPT), the first to utilize sequential image information for interactive segmentation. |
S. Cheng; | icassp | 2025-04-15 |
| 533 | UMSSS: A Visual Scene Semantic Segmentation Dataset for Underground Mines Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a challenging semantic segmentation dataset focusing on underground mines, named the underground mine scenes semantic segmentation (UMSSS) dataset, which contains 4200 high-quality annotated images and 18 annotated categories. |
J. Wang; | icassp | 2025-04-15 |
| 534 | Hazy Remote Sensing Image Semantic Segmentation with Weak Annotations Via Pre-training Optimization and Co-training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite the numerous haze removal methods developed for remote sensing images, their efficacy in the subsequent task of semantic segmentation remains inadequate. To address these issues, this paper aims to enhance the robustness of the segmentation network against haze interference by proposing a weakly supervised semantic segmentation framework based on pre-training optimization and dual-network co-training. |
J. Xu; L. Zhang; | icassp | 2025-04-15 |
| 535 | MASSeg : 2nd Technical Report for 4th PVUW MOSE Track Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This report presents our solution, which ranked second in the MOSE track of CVPR 2025 PVUW Challenge. |
XUQIANG CAO et. al. | arxiv-cs.CV | 2025-04-14 |
| 536 | IGL-DT: Iterative Global-Local Feature Learning with Dual-Teacher Semantic Segmentation Framework Under Limited Annotation Scheme Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods struggle to balance global semantic representation with fine-grained local feature extraction. To address this challenge, we propose a novel tri-branch semi-supervised segmentation framework incorporating a dual-teacher strategy, named IGL-DT. |
DINH DAI QUAN TRAN et. al. | arxiv-cs.CV | 2025-04-13 |
| 537 | AerOSeg: Harnessing SAM for Open-Vocabulary Segmentation in Remote Sensing Images IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This necessitates the development of OVS approaches specifically tailored for remote sensing. In this context, we propose AerOSeg, a novel OVS approach for remote sensing data. |
Saikat Dutta; Akhil Vasim; Siddhant Gole; Hamid Rezatofighi; Biplab Banerjee; | arxiv-cs.CV | 2025-04-12 |
| 538 | ChildlikeSHAPES: Semantic Hierarchical Region Parsing for Animating Figure Drawings Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This semantic understanding is a crucial prerequisite for animation tools that seek to modify figures while preserving their unique style. To help achieve this, we propose a novel hierarchical segmentation model, built upon the architecture and pre-trained SAM, to quickly and accurately obtain these semantic labels. |
Astitva Srivastava; Harrison Jesse Smith; Thu Nguyen-Phuoc; Yuting Ye; | arxiv-cs.GR | 2025-04-10 |
| 539 | PathSegDiff: Pathology Segmentation Using Diffusion Model Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained featured extractors. |
Sachin Kumar Danisetty; Alexandros Graikos; Srikar Yellapragada; Dimitris Samaras; | arxiv-cs.CV | 2025-04-09 |
| 540 | MovSAM: A Single-image Moving Object Segmentation Framework Based on Deep Thinking Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, segmenting moving objects from a single image remains challenging for existing methods due to the absence of temporal cues. To address this gap, we propose MovSAM, the first framework for single-image moving object segmentation. |
CHANG NIE et. al. | arxiv-cs.CV | 2025-04-09 |
| 541 | InvNeRF-Seg: Fine-Tuning A Pre-Trained NeRF for 3D Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose Invariant NeRF for Segmentation (InvNeRFSeg), a two step, zero change fine tuning strategy for 3D segmentation. |
Jiangsan Zhao; Jakob Geipel; Krzysztof Kusnierek; Xuean Cui; | arxiv-cs.CV | 2025-04-08 |
| 542 | CCANet: Cross-Modality Comprehensive Feature Aggregation Network for Indoor Scene Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The semantic segmentation of indoor scenes based on RGB and depth information has been a persistent and enduring research topic. However, how to fully utilize the complementarity … |
ZHANG ZIHAO et. al. | IEEE Transactions on Cognitive and Developmental Systems | 2025-04-01 |
| 543 | Zero-Shot 4D Lidar Panoptic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the primary challenge in advancing research and developing generalized, versatile methods for spatio-temporal scene understanding in Lidar lies in the scarcity of datasets that provide the necessary diversity and scale of annotations. To overcome these challenges, we propose SAL-4D (Segment Anything in Lidar–4D), a method that utilizes multi-modal robotic sensor setups as a bridge to distill recent developments in Video Object Segmentation (VOS) in conjunction with off-the-shelf Vision-Language foundation models to Lidar. We utilize VOS models to pseudo-label tracklets in short video sequences, annotate these tracklets with sequence-level CLIP tokens, and lift them to the 4D Lidar space using calibrated multi-modal sensory setups to distill them to our SAL-4D model. |
Yushan Zhang; Aljoša Ošep; Laura Leal-Taixé; Tim Meinhardt; | arxiv-cs.CV | 2025-04-01 |
| 544 | Domain-Incremental Semantic Segmentation for Traffic Scenes Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Traffic scene segmentation is an important visual perception process to provide strong support for the decision-making of autonomous driving systems. The traffic scene is an open … |
Yazhou Liu; Haoqi Chen; P. Lasang; Zheng Wu; | IEEE Transactions on Intelligent Transportation Systems | 2025-04-01 |
| 545 | Combining Feature Compensation and GCN-based Reconstruction for Multimodal Remote Sensing Image Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Zhen Wang; Jiayuan Li; Nan Xu; Zhuhong You; | Inf. Fusion | 2025-04-01 |
| 546 | HSPFormer: Hierarchical Spatial Perception Transformer for Semantic Segmentation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Semantic perception in driving scenarios plays a crucial role in intelligent transportation systems. However, existing Transformer-based semantic segmentation methods often do not … |
SIYU CHEN et. al. | IEEE Transactions on Intelligent Transportation Systems | 2025-04-01 |
| 547 | Hierarchical Context Learning of Object Components for Unsupervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save |
Dong Bao; Jun Zhou; Gervase Tuxworth; Jue Zhang; Yongsheng Gao; | Pattern Recognit. | 2025-04-01 |
| 548 | Knowledge Distillation for Reduced Footprint Semantic Segmentation with The U-Net Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Model compression techniques such as knowledge distillation, pruning, and quantization are well documented in the computer vision literature for image classification and … |
Ciro Rosa; Nina Hirata; | Proceedings of the 40th ACM/SIGAPP Symposium on Applied … | 2025-03-31 |
| 549 | Improving Underwater Semantic Segmentation with Underwater Image Quality Attention and Muti-scale Aggregation Attention IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, the low illumination in underwater environments degrades the imaging quality, which in turn seriously deteriorates the performance of underwater semantic segmentation, particularly for outlining the object region boundaries. To tackle this issue, we present UnderWater SegFormer (UWSegFormer), a transformer-based framework for semantic segmentation of low-quality underwater images. |
Xin Zuo; Jiaran Jiang; Jifeng Shen; Wankou Yang; | arxiv-cs.CV | 2025-03-30 |
| 550 | Open-Vocabulary Semantic Segmentation with Uncertainty Alignment for Robotic Scene Understanding in Indoor Building Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Additionally, most existing methods ignore the uncertainty of the scene recognition problem, leading to low success rates, particularly in ambiguous and complex environments. To address these challenges, we propose an open-vocabulary scene semantic segmentation and detection pipeline leveraging Vision Language Models (VLMs) and Large Language Models (LLMs). |
Yifan Xu; Vineet Kamat; Carol Menassa; | arxiv-cs.CV | 2025-03-29 |
| 551 | A Dataset for Semantic Segmentation in The Presence of Unknowns Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing datasets allow evaluation of only knowns or unknowns – but not both, which is required to establish in the wild suitability of deep neural network models. To bridge this gap, we propose a novel anomaly segmentation dataset, ISSU, that features a diverse set of anomaly inputs from cluttered real-world environments. |
ZAKARIA LASKAR et. al. | arxiv-cs.CV | 2025-03-28 |
| 552 | Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, it often overfits and memorizes training data, limiting their ability to generate diverse and well-aligned samples. To overcome these issues, we propose Concept-Aware LoRA (CA-LoRA), a novel fine-tuning approach that selectively identifies and updates only the weights associated with necessary concepts (e.g., style or viewpoint) for domain alignment while preserving the pretrained knowledge of the T2I model to produce informative samples. |
MINHO PARK et. al. | arxiv-cs.CV | 2025-03-28 |
| 553 | Towards Generating Realistic 3D Semantic Training Data for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose a novel approach able to generate 3D semantic scene-scale data without relying on any projection or decoupled trained multi-resolution models, achieving more realistic semantic scene data generation compared to previous state-of-the-art methods. |
Lucas Nunes; Rodrigo Marcuzzi; Jens Behley; Cyrill Stachniss; | arxiv-cs.CV | 2025-03-27 |
| 554 | A Deep Learning Framework for Boundary-Aware Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, they still struggle with blurred target boundaries and insufficient recognition of small targets. To address these issues, this study proposes a Mask2Former-based semantic segmentation algorithm incorporating a boundary enhancement feature bridging module (BEFBM). |
TAI AN et. al. | arxiv-cs.CV | 2025-03-27 |
| 555 | OpenLex3D: A Tiered Evaluation Benchmark for Open-Vocabulary 3D Scene Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By introducing an open-set 3D semantic segmentation task and an object retrieval task, we evaluate various existing 3D open-vocabulary methods on OpenLex3D, showcasing failure cases, and avenues for improvement. Our experiments provide insights on feature precision, segmentation, and downstream capabilities. |
CHRISTINA KASSAB et. al. | arxiv-cs.CV | 2025-03-25 |
| 556 | The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We benchmark a wide range of semantic segmentation models, and find that transfer learning from Coralscapes to existing smaller datasets consistently leads to state-of-the-art performance. |
JONATHAN SAUDER et. al. | arxiv-cs.CV | 2025-03-25 |
| 557 | OpenLex3D: A New Evaluation Benchmark for Open-Vocabulary 3D Scene Representations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: 3D scene understanding has been transformed by open-vocabulary language models that enable interaction via natural language. However, the evaluation of these representations is … |
CHRISTINA KASSAB et. al. | ArXiv | 2025-03-25 |
| 558 | BIMII-Net: Brain-Inspired Multi-Iterative Interactive Network for RGB-T Road Scene Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Nevertheless, existing RGB-T semantic segmentation models typically depend on simple addition or concatenation strategies or ignore the differences between information at different levels. To address these issues, we proposed a novel RGB-T road scene semantic segmentation network called Brain-Inspired Multi-Iteration Interaction Network (BIMII-Net). |
Hanshuo Qiu; Jie Jiang; Ruoli Yang; Lixin Zhan; Jizhao Liu; | arxiv-cs.CV | 2025-03-24 |
| 559 | Context-Aware Semantic Segmentation: Enhancing Pixel-Level Understanding with Large Language Models for Advanced Vision Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current models, such as CNN and Transformer-based architectures, excel at identifying pixel-level features but fail to distinguish semantically similar objects (e.g., doctor vs. nurse in a hospital scene) or understand complex contextual scenarios (e.g., differentiating a running child from a regular pedestrian in autonomous driving). To address these limitations, we proposed a novel Context-Aware Semantic Segmentation framework that integrates Large Language Models (LLMs) with state-of-the-art vision backbones. |
Ben Rahman; | arxiv-cs.CV | 2025-03-24 |
| 560 | Seg2Box: 3D Object Detection By Point-Wise Semantics Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the challenge arises due to the incomplete geometry structure and boundary ambiguity of point-cloud instances, leading to inaccurate pseudo labels and poor detection results. To address these challenges, we propose a novel method, named Seg2Box. |
MAOJI ZHENG et. al. | arxiv-cs.CV | 2025-03-20 |
| 561 | SPNeRF: Open Vocabulary 3D Neural Scene Segmentation with Superpoints Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Extending these capabilities to 3D segmentation introduces challenges, as CLIP’s image-based embeddings often lack the geometric detail necessary for 3D scene segmentation. Recent methods tend to address this by introducing additional segmentation models or replacing CLIP with variations trained on segmentation data, which lead to redundancy or loss on CLIP’s general language capabilities. |
WEIWEN HU et. al. | arxiv-cs.CV | 2025-03-19 |
| 562 | High Temporal Consistency Through Semantic Similarity Propagation in Semi-Supervised Video Semantic Segmentation for Autonomous Flight Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a lightweight video semantic segmentation approach, suited to onboard real-time inference, achieving high temporal consistency on aerial data through Semantic Similarity Propagation across frames. |
Cédric Vincent; Taehyoung Kim; Henri Meeß; | arxiv-cs.CV | 2025-03-19 |
| 563 | USAM-Net: A U-Net-based Network for Improved Stereo Correspondence and Scene Depth Estimation Using Features from A Pre-trained Image Segmentation Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The increasing demand for high-accuracy depth estimation in autonomous driving and augmented reality applications necessitates advanced neural architectures capable of effectively leveraging multiple data modalities. In this context, we introduce the Unified Segmentation Attention Mechanism Network (USAM-Net), a novel convolutional neural network that integrates stereo image inputs with semantic segmentation maps and attention to enhance depth estimation performance. |
Joseph Emmanuel DL Dayo; Prospero C. Naval Jr; | arxiv-cs.CV | 2025-03-19 |
| 564 | 3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We accordingly propose the 3D-AffordanceLLM (3D-ADLLM), a framework designed for reasoning affordance detection in 3D open-scene. |
HENGSHUO CHU et. al. | iclr | 2025-03-17 |
| 565 | Adaptive Transformer Attention and Multi-Scale Fusion for Spine 3D Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This study proposes a 3D semantic segmentation method for the spine based on the improved SwinUNETR to improve segmentation accuracy and robustness. |
YANLIN XIANG et. al. | arxiv-cs.CV | 2025-03-17 |
| 566 | Adaptive Transformer Attention and Multi-Scale Fusion for Spine 3D Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This study proposes a 3D semantic segmentation method for the spine based on the improved SwinUNETR to improve segmentation accuracy and robustness. Aiming at the complex … |
YANLIN XIANG et. al. | 2025 5th International Conference on Artificial … | 2025-03-17 |
| 567 | Text4Seg: Reimagining Image Segmentation As Text Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce Text4Seg, a novel text-as-mask paradigm that casts image segmentation as a text generation problem, eliminating the need for additional decoders and significantly simplifying the segmentation process. |
MENGCHENG LAN et. al. | iclr | 2025-03-17 |
| 568 | DynAlign: Unsupervised Dynamic Taxonomy Alignment for Cross-Domain Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these models often struggle with domain-specific nuances and underrepresented fine-grained categories. To address these challenges, we introduce DynAlign, a two-stage framework that integrates UDA with foundation models to bridge both the image-level and label-level domain gaps. |
Han Sun; Rui Gong; Ismail Nejjar; Olga Fink; | iclr | 2025-03-17 |
| 569 | Clustering Is Back: Reaching State-of-the-art LiDAR Instance Segmentation Without Training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we demonstrate that competitive panoptic segmentation can be achieved using only semantic labels, with instances predicted without any training or annotations. |
Corentin Sautier; Gilles Puy; Alexandre Boulch; Renaud Marlet; Vincent Lepetit; | arxiv-cs.CV | 2025-03-17 |
| 570 | Class Distribution-induced Attention Map for Open-vocabulary Semantic Segmentations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we argue that CLIP-based prior works yield patch-wise noisy class predictions while having highly correlated class distributions for each object. |
Dong Un Kang; Hayeon Kim; Se Young Chun; | iclr | 2025-03-17 |
| 571 | Point Cloud Based Scene Segmentation: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To inspire future research, in this review paper, we provide a comprehensive overview of the current state-of-the-art methods in the field of Point Cloud Semantic Segmentation for autonomous driving. |
Dan Halperin; Niklas Eisl; | arxiv-cs.CV | 2025-03-16 |
| 572 | LangDA: Building Context-Awareness Via Language for Domain Adaptive Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Two key approaches in DASS are (1) vision-only approaches using masking or multi-resolution crops, and (2) language-based approaches that use generic class-wise prompts informed by target domain (e.g. a {snowy} photo of a {class}). |
CHANG LIU et. al. | arxiv-cs.CV | 2025-03-16 |
| 573 | OSMa-Bench: Evaluating Open Semantic Mapping Under Varying Lighting Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a dynamically configurable and highly automated LLM/LVLM-powered pipeline for evaluating OSM solutions called OSMa-Bench (Open Semantic Mapping Benchmark). |
Maxim Popov; Regina Kurkova; Mikhail Iumanov; Jaafar Mahmoud; Sergey Kolyubin; | arxiv-cs.CV | 2025-03-13 |
| 574 | Entropy Guidance Hierarchical Rich-scale Feature Network for Remote Sensing Image Semantic Segmentation of High Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
HAOXUE ZHANG et. al. | Appl. Intell. | 2025-03-13 |
| 575 | Zero-shot Image Segmentation for Scene Objects Based on The L0 Gradient Minimization and Adaptive Superpixel Method Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Hailong Yan; Junjiang Huang; Mao Zheng; Yijie Tang; | Neural Comput. Appl. | 2025-03-12 |
| 576 | MaskAttn-UNet: A Mask Attention-Driven Framework for Universal Low-Resolution Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Low-resolution image segmentation is crucial in real-world applications such as robotics, augmented reality, and large-scale scene understanding, where high-resolution data is often unavailable due to computational constraints. To address this challenge, we propose MaskAttn-UNet, a novel segmentation framework that enhances the traditional U-Net architecture via a mask attention mechanism. |
ANZHE CHENG et. al. | arxiv-cs.CV | 2025-03-11 |
| 577 | Approximate Size Targets Are Sufficient for Accurate Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our ideas are validated on PASCAL VOC using our new human annotations of approximate object sizes. |
Xingye Fan; Yuri Boykov; | arxiv-cs.CV | 2025-03-10 |
| 578 | OmniSAM: Omnidirectional Segment Anything Model for UDA in Panoramic Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Two major concerns for this application includes 1) inevitable distortion and object deformation brought by the large FoV disparity between domains; 2) the lack of pixel-level semantic understanding that the original SAM2 cannot provide. To address these issues, we propose a novel OmniSAM framework, which makes the first attempt to apply SAM2 for panoramic semantic segmentation. |
DING ZHONG et. al. | arxiv-cs.CV | 2025-03-10 |
| 579 | Aligning Instance-Semantic Sparse Representation Towards Unsupervised Object Segmentation and Shape Abstraction with Repeatable Primitives Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Driven by the tendency of high-dimensional semantically similar features to lie in or near low-dimensional subspaces, we introduce a one-stage, fully unsupervised framework towards semantic-aware shape representation. |
Jiaxin Li; Hongxing Wang; Jiawei Tan; Zhilong Ou; Junsong Yuan; | arxiv-cs.CV | 2025-03-10 |
| 580 | MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Inspired by cross-frame correlation in videos, we propose to treat multi-modal data as a sequence of frames representing the same scene. |
CHENFEI LIAO et. al. | arxiv-cs.CV | 2025-03-09 |
| 581 | Dynamically Evolving Segment Anything Model with Continuous Learning for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, in practical applications, the diversity of scenarios and tasks in medical image segmentation continues to expand, necessitating models that can dynamically evolve to meet the demands of various segmentation tasks. Here, we introduce EvoSAM, a dynamically evolving medical image segmentation model that continuously accumulates new knowledge from an ever-expanding array of scenarios and tasks, enhancing its segmentation capabilities. |
ZHAORI LIU et. al. | arxiv-cs.CV | 2025-03-08 |
| 582 | EvidMTL: Evidential Multi-Task Learning for Uncertainty-Aware Semantic Surface Mapping from Monocular RGB Images Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing mapping methods often suffer from overconfident semantic predictions, and sparse and noisy depth sensing, leading to inconsistent map representations. In this paper, we therefore introduce EvidMTL, a multi-task learning framework that uses evidential heads for depth estimation and semantic segmentation, enabling uncertainty-aware inference from monocular RGB images. |
Rohit Menon; Nils Dengler; Sicong Pan; Gokul Krishna Chenchani; Maren Bennewitz; | arxiv-cs.RO | 2025-03-06 |
| 583 | BEVMOSNet: Multimodal Fusion for BEV Moving Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Conversely, LiDAR and radar sensors remain almost unaffected in these scenarios, and radar provides key velocity information of the objects. Therefore, we introduce BEVMOSNet, to our knowledge, the first end-to-end multimodal fusion leveraging cameras, LiDAR, and radar to precisely predict the moving objects in BEV. |
HIEP TRUONG CONG et. al. | arxiv-cs.CV | 2025-03-05 |
| 584 | SurgiSAM2: Fine-tuning A Foundational Model for Surgical Video Anatomy Segmentation and Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Methods: We utilized five public datasets to evaluate and fine-tune SAM 2 for segmenting anatomical tissues in surgical videos/images. |
DEVANISH N. KAMTAM et. al. | arxiv-cs.CV | 2025-03-05 |
| 585 | GaussianGraph: 3D Gaussian-based Scene Graph Generation for Open-world Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods primarily focus on embedding compressed CLIP features to 3D Gaussians, suffering from low object segmentation accuracy and lack spatial reasoning capabilities. To address these limitations, we propose GaussianGraph, a novel framework that enhances 3DGS-based scene understanding by integrating adaptive semantic clustering and scene graph generation. |
XIHAN WANG et. al. | arxiv-cs.CV | 2025-03-05 |
| 586 | TS‐Net: Trans‐Scale Network for Medical Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Accurate medical image segmentation is crucial for clinical diagnosis and disease treatment. However, there are still great challenges for most existing methods to extract … |
HuiFang Wang; Yatong Liu; Jiongyao Ye; Dawei Yang; Yu Zhu; | International Journal of Imaging Systems and Technology | 2025-03-01 |
| 587 | Pseudo 5D Hyperspectral Light Field for Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Ruixuan Cong; Hao Sheng; Da Yang; Rongshan Chen; Zhenglong Cui; | Inf. Fusion | 2025-03-01 |
| 588 | Exploring The Better Correlation for Few-Shot Video Object Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Few-shot video object segmentation (FSVOS) aims to achieve accurate segmentation of novel objects in given video sequences, where the target objects are specified by limited … |
NAISONG LUO et. al. | IEEE Transactions on Circuits and Systems for Video … | 2025-03-01 |
| 589 | Attention Guided Filter and Refinement Feature Network for Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
SHUSHENG LI et. al. | Knowl. Based Syst. | 2025-03-01 |
| 590 | Adaptive Sparse Lightweight Multi-scale Hybrid Network for Remote Sensing Image Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
HAONAN SUN et. al. | Expert Syst. Appl. | 2025-03-01 |
| 591 | DCSSGA-UNet: Biomedical Image Segmentation with DenseNet Channel Spatial and Semantic Guidance Attention IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Tahir Hussain; Hayaru Shouno; M. A. Mohammed; Haydar Abdulameer Marhoon; Taukir Alam; | Knowl. Based Syst. | 2025-03-01 |
| 592 | CMAA: Channel-wise Multi-scale Adaptive Attention Network for Metallographic Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yongliang Sun; Xiangyang Huang; | Expert Syst. Appl. | 2025-03-01 |
| 593 | CGViT: Cross-image GroupViT for Zero-shot Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jie Jiang; Xingjian He; Xinxin Zhu; Weining Wang; Jing Liu; | Pattern Recognit. | 2025-03-01 |
| 594 | Boundaries Matters: A Novel Multibranch Semisupervised Semantic Segmentation Method Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In recent years, semisupervised semantic segmentation (SSS) research has been progressing rapidly. Existing methods usually ignore the classification of detailed pixels, such as … |
Yitong Li; Changlun Zhang; Hengyou Wang; | IEEE Intelligent Systems | 2025-03-01 |
| 595 | FuseForm: Multimodal Transformer for Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: For semantic segmentation, integrating multimodal data can vastly improve segmentation performance at the cost of increased model complexity. We introduce FuseForm, a multimodal … |
Justin McMillen; Yasin Yilmaz; | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-28 |
| 596 | Realistic Evaluation of Deep Active Learning for Image Classification and Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
SUDHANSHU MITTAL et. al. | Int. J. Comput. Vis. | 2025-02-28 |
| 597 | Enhanced Neuromorphic Semantic Segmentation Latency Through Stream Event Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditional frame-based methods often struggle to balance latency, accuracy, and energy efficiency. To address these challenges, we leverage event streams from event-based cameras, bio-inspired sensors that trigger events in response to changes in the scene. |
D. Hareb; J. Martinet; B. Miramond; | arxiv-cs.CV | 2025-02-26 |
| 598 | Learning Under Noisy Labels, Spurious Points, and Diverse Structures: TS40K, A 3D Point Cloud Dataset of Rural Terrain and Electrical Transmission Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Research in 3D scene understanding, particularly in autonomous driving and indoor segmentation, has made significant strides. However, most available datasets focus on urban … |
DIOGO MATEUS LAVADO et. al. | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-26 |
| 599 | SegBuilder: A Semi-Automatic Annotation Tool for Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper addresses the problem of image annotation for segmentation tasks. Semantic segmentation involves labeling each pixel in an image with predefined categories, such as … |
Md. Alimoor Reza; Eric Manley; Sean Chen; Sameer Chaudhary; Jacob Elafros; | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-26 |
| 600 | AiDe: Improving 3D Open-Vocabulary Semantic Segmentation By Aligned Vision-Language Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: 3D open-vocabulary semantic segmentation aims at recognizing countless categories beyond the limited set of annotations used in traditional settings. Due to the lack of … |
Yimu Wang; Krzysztof Czarnecki; | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2025-02-26 |
| 601 | Efficient Event-Based Semantic Segmentation Via Exploiting Frame-Event Fusion: A Hybrid Neural Network Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing event-based semantic segmentation methods often fail to fully exploit the complementary information provided by frames and events, resulting in complex training strategies and increased computational costs. To address these challenges, we propose an efficient hybrid framework for image semantic segmentation, comprising a Spiking Neural Network branch for events and an Artificial Neural Network branch for frames. |
HEBEI LI et. al. | aaai | 2025-02-25 |
| 602 | SeeDiff: Off-the-Shelf Seeded Mask Generation from Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we take a closer look at attention mechanisms of Stable Diffusion, from which we draw connections with classical seeded segmentation approaches. |
Joon Hyun Park; Kumju Jo; Sungyong Baik; | aaai | 2025-02-25 |
| 603 | Skip Mamba Diffusion for Monocular 3D Semantic Scene Completion IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a unique neural model, leveraging advances from the state space and diffusion generative modeling to achieve remarkable 3D semantic scene completion performance with monocular image input. |
Li Liang; Naveed Akhtar; Jordan Vice; Xiangrui Kong; Ajmal Saeed Mian; | aaai | 2025-02-25 |
| 604 | Structural Pruning Via Spatial-aware Information Redundancy for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Within this framework, we introduce a spatial-aware redundancy metric based on feature maps, thus endowing the pruning process with location sensitivity to better adapt to pruning segmentation networks. |
Dongyue Wu; Zilin Guo; Li Yu; Nong Sang; Changxin Gao; | aaai | 2025-02-25 |
| 605 | Exploring Semantic Consistency and Style Diversity for Domain Generalized Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Domain randomization-based methods frequently incorporate domain-irrelevant noise due to the uncontrollability of style transformations, resulting in segmentation ambiguity. To address these challenges, we introduce a novel framework, named SCSD for Semantic Consistency prediction and Style Diversity generalization. |
Hongwei Niu; Linhuang Xie; Jianghang Lin; Shengchuan Zhang; | aaai | 2025-02-25 |
| 606 | SGTC: Semantic-Guided Triplet Co-training for Sparsely Annotated Semi-Supervised Medical Image Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Even worse, most of the existing approaches pay much attention to image-level information and ignore semantic features, resulting in the inability to perceive weak boundaries. To address these issues, we propose a novel Semantic-Guided Triplet Co-training (SGTC) framework, which achieves high-end medical image segmentation by only annotating three orthogonal slices of a few volumetric samples, significantly alleviating the burden of radiologists. |
Ke Yan; Qing Cai; Fan Zhang; Ziyan Cao; Zhi Liu; | aaai | 2025-02-25 |
| 607 | InvSeg: Test-Time Prompt Inversion for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This discrepancy hinders diffusion models from capturing accurate visual-textual correlations. To solve this, we propose InvSeg, a test-time prompt inversion method that tackles open-vocabulary semantic segmentation by inverting image-specific visual context into text prompt embedding space, leveraging structure information derived from the diffusion model’s reconstruction process to enrich text prompts so as to associate each class with a structure-consistent mask. |
Jiayi Lin; Jiabo Huang; Jian Hu; Shaogang Gong; | aaai | 2025-02-25 |
| 608 | Weakly Supervised Gland Segmentation with Class Semantic Consistency and Purified Labels Filtration Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, for class consistency, we propose Consistency Correlation Attention (CCA) to encourage the network to focus on the contribution of class features to semantic dependencies. |
SIYANG FENG et. al. | aaai | 2025-02-25 |
| 609 | Multi-Granularity Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we aim to generate multi-granularity video segmentation dataset that is annotated for both salient and non-salient masks. |
SANGBEOM LIM et. al. | aaai | 2025-02-25 |
| 610 | Holistic Correction with Object Prototype for Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a Holistic Correction Network (HCNet) to adaptively acquire concise object prototypes for holistic correction at semantic, spatial and temporal aspects. |
Shengye Qiao; Changqun Xia; Yanjie Liang; Gongjin Lan; Jia Li; | aaai | 2025-02-25 |
| 611 | S2S2: Semantic Stacking for Robust Semantic Segmentation in Medical Imaging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In response, we introduce a novel, domain-agnostic, add-on, and data-driven strategy inspired by image stacking in image denoising. |
Yimu Pan; Sitao Zhang; Alison D. Gernand; Jeffery A. Goldstein; James Z. Wang; | aaai | 2025-02-25 |
| 612 | Pointmap Association and Piecewise-Plane Constraint for Consistent and Compact 3D Gaussian Segmentation Field Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As a result, optimization typically lacks awareness of semantic category information, which can result in floaters with ambiguous segmentation. To address these challenges, we introduce CCGS, a method designed to achieve both view consistent 2D segmentation and a compact 3D Gaussian segmentation field. |
WENHAO HU et. al. | arxiv-cs.CV | 2025-02-22 |
| 613 | Hybrid Deep Learning Aerial Framework for Road Scene Objects Segmentation and Classification IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This research proposes an advanced approach of object segmentation and categorization using aerial image sequences for enhancing intelligent traffic monitoring systems. … |
Aysha Naseer; Ahmad Jalal; | 2025 6th International Conference on Advancements in … | 2025-02-18 |
| 614 | From Open-Vocabulary to Vocabulary-Free Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work proposes a Vocabulary-Free Semantic Segmentation pipeline, eliminating the need for predefined class vocabularies. |
KLARA REICHARD et. al. | arxiv-cs.CV | 2025-02-17 |
| 615 | NPSim: Nighttime Photorealistic Simulation From Daytime Images With Monocular Inverse Rendering and Ray Tracing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this thesis, we introduce a novel approach named NPSim, which enables the simulation of realistic nighttime images from real daytime counterparts with monocular inverse rendering and ray tracing. |
Shutong Zhang; | arxiv-cs.CV | 2025-02-15 |
| 616 | Prototype Contrastive Consistency Learning for Semi-Supervised Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, although previous contrastive learning methods can mine semantic information from partial pixels within images, they ignore the whole context information of unlabeled images, which is very important to precise segmentation. In order to solve this problem, we propose a novel prototype contrastive learning method called Prototype Contrastive Consistency Segmentation (PCCS) for semi-supervised medical image segmentation. |
Shihuan He; Zhihui Lai; Ruxin Wang; Heng Kong; | arxiv-cs.CV | 2025-02-10 |
| 617 | Convolutional Neural Network Segmentation for Satellite Imagery Data to Identify Landforms Using U-Net Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The study applies the U-Net model for effective feature extraction by using Convolutional Neural Network (CNN) segmentation techniques. |
Mitul Goswami; Sainath Dey; Aniruddha Mukherjee; Suneeta Mohanty; Prasant Kumar Pattnaik; | arxiv-cs.CV | 2025-02-08 |
| 618 | Deep Unfolding Multi-modal Image Fusion Network Via Attribution Analysis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although some approaches attempt to jointly optimize image fusion and downstream tasks, these efforts often lack direct guidance or interaction, serving only to assist with a predefined fusion loss. To address this, we propose an “Unfolding Attribution Analysis Fusion network” (UAAFusion), using attribution analysis to tailor fused images more effectively for semantic segmentation, enhancing the interaction between the fusion and segmentation. |
HAOWEN BAI et. al. | arxiv-cs.CV | 2025-02-03 |
| 619 | Image-text Aggregation for Open-vocabulary Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Shengyang Cheng; Jianyong Huang; Xiaodong Wang; Lei Huang; Zhiqiang Wei; | Neurocomputing | 2025-02-01 |
| 620 | LBFormer: Scene Perception Segmentation Transformer Based on Local Block Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Scene perception for autonomous vehicles and vessels is crucial for autonomous navigation. Current mainstream transformer methods typically split the feature map into windows, … |
YONGJIE ZHANG et. al. | IEEE Transactions on Industrial Informatics | 2025-02-01 |
| 621 | Image-point Cloud Embedding Network for Simultaneous Image-based Farmland Instance Extraction and Point Cloud-based Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jinpeng Li; Yuan Li; Shuhang Zhang; Yiping Chen; | Int. J. Appl. Earth Obs. Geoinformation | 2025-02-01 |
| 622 | INF-PCA: Implicit Neural Field-Based Interactive Point Cloud Semantic Annotation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Point cloud semantic segmentation helps Intelligent Transportation Systems understand traffic scenes by assigning semantic label to each point in the point cloud, and it relies on … |
CHONG LIU et. al. | IEEE Transactions on Intelligent Transportation Systems | 2025-02-01 |
| 623 | Removing Visual Occlusion of Construction Scaffolds Via A Two-step Method Combining Semantic Segmentation and Image Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yuexiong Ding; Muyang Liu; Ming Zhang; Xiaowei Luo; | Eng. Appl. Artif. Intell. | 2025-02-01 |
| 624 | Boundary Semantic Interactive Aggregation Network for Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Fan Zhang; Qijun Lv; Binrong Pan; Yun Wang; | Expert Syst. Appl. | 2025-02-01 |
| 625 | Increase The Sensitivity of Moderate Examples for Semantic Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
QUAN TANG et. al. | Image Vis. Comput. | 2025-02-01 |
| 626 | Lifting By Gaussians: A Simple, Fast and Flexible Method for 3D Instance Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Lifting By Gaussians (LBG), a novel approach for open-world instance segmentation of 3D Gaussian Splatted Radiance Fields (3DGS). |
Rohan Chacko; Nicolai Haeni; Eldar Khaliullin; Lin Sun; Douglas Lee; | arxiv-cs.CV | 2025-01-31 |
| 627 | SynthmanticLiDAR: A Synthetic Dataset for Semantic Segmentation on LiDAR Imaging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we present a modified CARLA simulator designed with LiDAR semantic segmentation in mind, with new classes, more consistent object labeling with their counterparts from real datasets such as SemanticKITTI, and the possibility to adjust the object class distribution. |
Javier Montalvo; Pablo Carballeira; Álvaro García-Martín; | arxiv-cs.CV | 2025-01-31 |
| 628 | Side Information-driven Image Coding for Hybrid Machine–human Vision Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the development of machine learning, advanced photography and image transmission systems, images are being processed more and more by machines, so image coding for machines … |
Zhongpeng Zhang; Ying Liu; Wen-Hsiao Peng; | EURASIP Journal on Image and Video Processing | 2025-01-28 |
| 629 | Freestyle Sketch-in-the-Loop Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we expand the domain of sketch research into the field of image segmentation, aiming to establish freehand sketches as a query modality for subjective image segmentation. |
SUBHADEEP KOLEY et. al. | arxiv-cs.CV | 2025-01-27 |
| 630 | Cross-Domain Semantic Segmentation with Large Language Model-Assisted Descriptor Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose LangSeg, a novel LLM-guided semantic segmentation method that leverages context-sensitive, fine-grained subclass descriptors generated by LLMs. |
Philip Hughes; Larry Burns; Luke Adams; | arxiv-cs.CV | 2025-01-27 |
| 631 | D-PLS: Decoupled Semantic Segmentation for 4D-Panoptic-LiDAR-Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a novel approach to 4D Panoptic LiDAR Segmentation that decouples semantic and instance segmentation, leveraging single-scan semantic predictions as prior information for instance segmentation. |
Maik Steinhauser; Laurenz Reichardt; Nikolas Ebert; Oliver Wasenmüller; | arxiv-cs.CV | 2025-01-27 |
| 632 | Improved Gated Recurrent Units Together with Fusion for Semantic Segmentation of Remote Sensing Images Based on Parallel Hybrid Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Transformer together with convolutional neural network (CNN) has achieved better performance than the pure module-based methods. However, the advantages of both coding styles … |
Tongchi Zhou; Hongyu He; Yanzhao Wang; Yuan Liao; | Multim. Syst. | 2025-01-20 |
| 633 | Rethinking Early-Fusion Strategies for Improved Multimodal Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these methods require massive parameter updates and computational effort during the feature extraction and fusion. To address this issue, we propose a novel multimodal fusion network (EFNet) based on an early fusion strategy and a simple but effective feature clustering for training efficient RGB-T semantic segmentation. |
Zhengwen Shen; Yulian Li; Han Zhang; Yuchen Weng; Jun Wang; | arxiv-cs.CV | 2025-01-19 |
| 634 | LSSMask: A Lightweight Semantic Segmentation Network for Dynamic Object Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Xiaofeng Lian; Maomao Kang; Li Tan; Xiao Sun; Yanli Wang; | Signal Image Video Process. | 2025-01-17 |
| 635 | Surface-SOS: Self-Supervised Object Segmentation Via Neural Surface Representation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Under conditions of multi-camera inputs, the structural, textural and geometrical consistency among each view can be leveraged to achieve fine-grained object segmentation. To make better use of the above information, we propose Surface representation based Self-supervised Object Segmentation (Surface-SOS), a new framework to segment objects for each view by 3D surface representation from multi-view images of a scene. |
Xiaoyun Zheng; Liwei Liao; Jianbo Jiao; Feng Gao; Ronggang Wang; | arxiv-cs.CV | 2025-01-16 |
| 636 | Unsupervised Semantic Segmentation of Urban Scenes Via Cross-Modal Distillation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic image segmentation models typically require extensive pixel-wise annotations, which are costly to obtain and prone to biases. Our work investigates learning semantic … |
ANTONÍN VOBECKÝ et. al. | Int. J. Comput. Vis. | 2025-01-15 |
| 637 | Hierarchical Superpixel Segmentation Via Structural Information Theory Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: These approaches do not fully leverage the global information in the graph, leading to suboptimal segmentation quality. To address this limitation, we present SIT-HSS, a hierarchical superpixel segmentation method based on structural information theory. |
MINHUI XIE et al. | arxiv-cs.CV | 2025-01-13 |
| 638 | Adaptive Noise-Tolerant Network for Image Segmentation Highlight: In this paper, instead of relying on clean segmentation labels, we study whether and how integrating imperfect or noisy segmentation results from off-the-shelf segmentation algorithms may help achieve better segmentation results through a new Adaptive Noise-Tolerant Network (ANTN) model. |
Weizhi Li; | arxiv-cs.CV | 2025-01-13 |
| 639 | RSRefSeg: Referring Remote Sensing Image Segmentation with Foundation Models IF:3 Highlight: However, these approaches often struggle to establish robust alignments between fine-grained semantic concepts, leading to inconsistent representations across textual and visual information. To address these limitations, we introduce a referring remote sensing image segmentation foundational model, RSRefSeg. |
Keyan Chen; Jiafan Zhang; Chenyang Liu; Zhengxia Zou; Zhenwei Shi; | arxiv-cs.CV | 2025-01-12 |
| 640 | LarvSeg: Exploring Image Classification Data For Large Vocabulary Semantic Segmentation Via Category-wise Attentive Classifier Highlight: In this paper, we propose a new large vocabulary semantic segmentation framework, called LarvSeg. |
HAOJUN YU et al. | arxiv-cs.CV | 2025-01-12 |
| 641 | Multi-Grained Contrastive Learning for Text-Supervised Open-Vocabulary Semantic Segmentation Abstract: Learning open-vocabulary semantic segmentation (OVSS) from text supervision has recently received increasing attention for its promising potential in real-world applications. … |
Yajie Liu; Pu Ge; Guodong Wang; Qingjie Liu; Di-Wei Huang; | ACM Transactions on Multimedia Computing, Communications … | 2025-01-10 |
| 642 | Image Segmentation: Inducing Graph-based Learning Highlight: We compare our proposed UNet-GNN model against established convolutional neural networks (CNNs) based segmentation models, including U-Net and U-Net++, as well as the transformer-based SwinUNet. |
Aryan Singh; Pepijn Van de Ven; Ciarán Eising; Patrick Denny; | arxiv-cs.CV | 2025-01-07 |
| 643 | BEN: Using Confidence-Guided Matting for Dichotomous Image Segmentation Highlight: As improvements in image segmentation become increasingly challenging to achieve, combining image matting and grayscale segmentation techniques offers promising new directions for architectural innovation. Inspired by the possibility of aligning these two model tasks, we propose a new architectural approach for DIS called Confidence-Guided Matting (CGM). |
Maxwell Meyer; Jack Spruyt; | arxiv-cs.CV | 2025-01-07 |
| 644 | LM-Net: A Light-weight and Multi-scale Network for Medical Image Segmentation Highlight: This results in over-segmentation, under-segmentation, and blurred segmentation boundaries. To tackle these challenges, we explore multi-scale feature representations from different perspectives, proposing a novel, lightweight, and multi-scale architecture (LM-Net) that integrates advantages of both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) to enhance segmentation accuracy. |
Zhenkun Lu; Chaoyin She; Wei Wang; Qinghua Huang; | arxiv-cs.CV | 2025-01-07 |
| 645 | Superpixel Boundary Correction for Weakly-Supervised Semantic Segmentation on Histopathology Images Highlight: However, Class Activation Map (CAM)-based methods still suffer from low spatial resolution and unclear boundaries. To address these issues, we propose a multi-level superpixel correction algorithm that refines CAM boundaries using superpixel clustering and floodfill. |
Hongyi Wu; Hong Zhang; | arxiv-cs.CV | 2025-01-07 |
| 646 | The 2nd Place Solution from The 3D Semantic Segmentation Track in The 2024 Waymo Open Dataset Challenge Highlight: In this report, we introduce MixSeg3D, a sophisticated combination of the strong point cloud segmentation model with advanced 3D data mixing strategies. |
Qing Wu; | arxiv-cs.CV | 2025-01-06 |
| 647 | Enhancing Semantic Scene Segmentation for Indoor Autonomous Systems Using Advanced Attention-supported Improved UNet |
HOANG N. TRAN et al. | Signal Image Video Process. | 2025-01-06 |
| 648 | 4D-CS: Exploiting Cluster Prior for 4D Spatio-Temporal LiDAR Semantic Segmentation Highlight: However, these methods often overlook the segmentation consistency in space and time, which may result in point clouds within the same object being predicted as different categories. To handle this issue, our core idea is to generate cluster labels across multiple frames that can reflect the complete spatial structure and temporal information of objects. |
Jiexi Zhong; Zhiheng Li; Yubo Cui; Zheng Fang; | arxiv-cs.CV | 2025-01-06 |
| 649 | MedSegDiffNCA: Diffusion Models With Neural Cellular Automata for Skin Lesion Segmentation Highlight: This work proposes three NCA-based improvements for diffusion-based medical image segmentation. |
Avni Mittal; John Kalkhof; Anirban Mukhopadhyay; Arnav Bhavsar; | arxiv-cs.CV | 2025-01-05 |
| 650 | IAM: Enhancing RGB-D Instance Segmentation with New Benchmarks Highlight: There is a relative scarcity of instance-level RGB-D segmentation datasets, which restricts current methods to broad category distinctions rather than fully capturing the fine-grained details required for recognizing individual objects. To bridge this gap, we introduce three RGB-D instance segmentation benchmarks, distinguished at the instance level. |
Aecheon Jung; Soyun Choi; Junhong Min; Sungeun Hong; | arxiv-cs.CV | 2025-01-03 |
| 651 | SIT-SAM: A Semantic-integration Transformer That Adapts The Segment Anything Model to Zero-shot Medical Image Semantic Segmentation |
Wentao Shi; Junjun He; Yiqing Shen; | Biomed. Signal Process. Control. | 2025-01-01 |
| 652 | Geographical Scenario Knowledge-Informed Graph Structure Attention for Image Segmentation Abstract: Deep learning methods, renowned for their ability to discern physical features from images, are frequently used in the semantic segmentation of remote sensing images. However, … |
HUILING ZHAO et al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 653 | Low-Light Enhancement and Global-Local Feature Interaction for RGB-T Semantic Segmentation IF:3 Abstract: The performance of RGB-T semantic segmentation tasks is affected by the quality of visible (VIS) and infrared (IR) images captured by sensor instruments. In low-light … |
Xueyi Guo; Yisha Liu; Weimin Xue; Zhiwei Zhang; Zhuang Yan; | IEEE Transactions on Instrumentation and Measurement | 2025-01-01 |
| 654 | Hierarchical Super-Pixels Graph Neural Networks for Image Semantic Segmentation |
Xavier Hoarau; Julien Mille; Hugo Raguet; Romain Raveaux; | Workshop on Graph Based Representations in Pattern … | 2025-01-01 |
| 655 | Robust Semantic Segmentation of Wafer Transmission Electron Microscopy Image Using Multi-Task Learning With Edge Detection Abstract: Semantic segmentation for wafer transmission electron microscopy (TEM) images plays a crucial role in the automated measurement of semiconductors. However, the automated … |
YONG-DAE JO et al. | IEEE Access | 2025-01-01 |
| 656 | L2A: Learning Affinity From Attention for Weakly Supervised Continual Semantic Segmentation Abstract: Despite significant advances in continual semantic segmentation (CSS), they still rely on the pixel-level annotation to train models, which is time-consuming and labor-intensive. … |
Hao Liu; Yong Zhou; Bing Liu; Ming Yan; Joey Tianyi Zhou; | IEEE Transactions on Circuits and Systems for Video … | 2025-01-01 |
| 657 | 3D Scene Segmentation: A Comprehensive Survey and Open Problems |
Slavcho Neshev; Krasimir Tonchev; A. Manolova; Vladimir K. Poulkov; | IEEE Access | 2025-01-01 |
| 658 | SACU-Net: Shape-Aware U-Net for Biomedical Image Segmentation With Attention Mechanism and Context Extraction IF:3 Abstract: With the rapid development of convolutional neural networks in image processing, deep learning has been widely applied to medical image segmentation tasks, including liver, … |
Yinuo Cao; Yong Cheng; | IEEE Access | 2025-01-01 |
| 659 | HPAN: Hierarchical Part-Aware Network for Fine-Grained Segmentation of Street View Imagery Abstract: Street view imagery (SVI) has become a valuable geospatial data source for urban analysis, offering rich information about urban environments from a human-centric perspective. … |
LEIYANG ZHONG et al. | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 660 | Frequency-Aware Integrity Learning Network for Semantic Segmentation of Remote Sensing Images Abstract: The semantic segmentation of remote sensing images is crucial for computer perception tasks. Integrating dual-modal information enhances semantic understanding. However, existing … |
Penghan Yang; Wujie Zhou; Yuanyuan Liu; | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 661 | Tuning A SAM-Based Model With Multicognitive Visual Adapter to Remote Sensing Instance Segmentation Abstract: The segment anything model (SAM), a foundational model designed for promptable segmentation tasks, demonstrates exceptional generalization capabilities, making it highly promising … |
Linghao Zheng; Xinyang Pu; Su Zhang; Feng Xu; | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 662 | Deep Merge: Deep-Learning-Based Region Merging for Remote Sensing Image Segmentation Abstract: Image segmentation represents a fundamental step in analyzing very high-spatial-resolution (VHR) remote sensing imagery. Its objective is to partition an image into segments that … |
XIANWEI LV et al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 663 | DSMF-Net: Dual Semantic Metric Learning Fusion Network for Few-Shot Aerial Image Semantic Segmentation Abstract: Semantic segmentation of aerial images is crucial yet resource-intensive. Inspired by human ability to learn rapidly, few-shot semantic segmentation offers a promising solution by … |
XIYU QI et al. | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 664 | UM2Former: U-Shaped Multimixed Transformer Network for Large-Scale Hyperspectral Image Semantic Segmentation IF:3 Abstract: Transformer-based deep learning (DL) methods have gradually been advocated for remote sensing (RS) image semantic segmentation due to the great global modeling capability. … |
AIJUN XU et al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 665 | Probability-Guided Edge Enhancement Network for Remote Sensing Image Semantic Segmentation Abstract: Semantic segmentation in remote sensing images (RSIs) assigns unique semantic labels to each pixel and plays a crucial role in real-world applications such as environmental change … |
Chunyan Yu; Yakun Zuo; Qiang Zhang; Yulei Wang; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 666 | 2TSS: Two-Tier Semantic Segmentation Framework With Enhancement for Hotspot Detection of Solar Photovoltaic Thermal Images Abstract: Recently, intelligence-based hotspot detection has been widely used in solar photovoltaic (PV) image applications. However, the semantic segmentation approach has limitations in … |
NURUL HUDA ISHAK et al. | IEEE Access | 2025-01-01 |
| 667 | Asymmetric Mamba–CNN Collaborative Architecture for Large-Size Remote Sensing Image Semantic Segmentation |
JINBO ZHANG et al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 668 | Depth Enhancement Mask Mapping Network With Multi-Teacher Distillation for RGB-D Scene Parsing Abstract: The proposed method in the study, called DEMMNet-KD, aims to address the challenges of scene parsing in RGB-D indoor scenes. The study recognizes the importance of lightweight … |
Yuxiang Xiao; Jiajun Meng; Fangfang Qiang; Xiena Dong; Wujie Zhou; | IEEE Transactions on Automation Science and Engineering | 2025-01-01 |
| 669 | Efficient Semantic Segmentation of Remote Sensing Images Through Global-Local Feature Integration Abstract: The rapid acquisition of remote sensing information plays a significant role in the development of image semantic segmentation methods for remote sensing image interpretation … |
Fengyi Zhang; Xiuyu Xia; | IEEE Access | 2025-01-01 |
| 670 | PromptSeg: Prompt for Universal Remote Sensing Semantic Segmentation Abstract: Semantic segmentation in remote sensing (RS) involves classifying each pixel of an image into predefined categories. Despite existing deep learning-based methods having … |
Jie Zhang; Mingwen Shao; Lingzhuang Meng; Xiangyong Cao; Shuigen Wang; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 671 | Grid Point Serialized Transformer for LiDAR Point Cloud Semantic Segmentation in Various Densities and Heights Scenes Abstract: Point cloud semantic segmentation is among the important tasks to achieve comprehensive perception of 3-D environments. However, current segmentation methods suffer from limited … |
Huchen Li; Wu-da Huang; Jiacheng Liu; Ke Chen; Fei Deng; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 672 | Multiscale Semantic Segmentation of Remote Sensing Images Based on Edge Optimization IF:3 Abstract: Semantic segmentation of remote sensing images is crucial for disaster monitoring, urban planning, and land use. Due to scene complexity and multiscale features of targets, … |
Wu-da Huang; Fei Deng; Haibing Liu; Mingtao Ding; Qi Yao; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 673 | Boosting Few-Shot Semantic Segmentation With Prior-Driven Edge Feature Enhancement Network Abstract: Few-shot semantic segmentation (FSS) focuses on segmenting objects of novel classes with only a small number of annotated samples and has achieved great development. However, … |
Jingkai Ma; Shuang Bai; Wenchao Pan; | IEEE Transactions on Artificial Intelligence | 2025-01-01 |
| 674 | Peering Into The Heart: A Comprehensive Exploration of Semantic Segmentation and Explainable AI on The MnMs-2 Cardiac MRI Dataset Abstract: Accurate and interpretable segmentation of medical images is crucial for computer-aided diagnosis and image-guided interventions. This study explores the integration of semantic … |
Mohamed Ayoob; Oshan Nettasinghe; Vithushan Sylvester; Helmini Bowala; Hamdaan Mohideen; | Applied Computer Systems | 2025-01-01 |
| 675 | PLDKD-Net: Pixel-Level Discriminative Knowledge Distillation for Surgical Scene Segmentation With Graph-Based Visual Parsing Abstract: Efficient laparoscopic scene segmentation holds significant potential for surgical assistive intelligence and image-guided task autonomy in robotic surgery. However, the abdominal … |
BO LU et al. | IEEE Transactions on Instrumentation and Measurement | 2025-01-01 |
| 676 | Look Twice and Closer: A Coarse-to-Fine Segmentation Network for Small Objects in Remote Sensing Images Abstract: Convolutional neural networks (CNNs) are frequently used to analyze remote sensing images and achieve impressive progress. Limited by the receptive field size of CNNs, small … |
Silin Chen; Qingzhong Wang; Kangjian Di; Haoyi Xiong; Ningmu Zou; | IEEE Signal Processing Letters | 2025-01-01 |
| 677 | Vision Foundation Model-Driven Multiscale Expert Tuning for Multimodal Remote Sensing Semantic Segmentation Abstract: Multimodal remote sensing semantic segmentation based on optical and digital surface model (Opt-DSM) data is pivotal for comprehensive scene interpretation. However, prevailing … |
Jiayuan Li; Zhen Wang; Nan Xu; Zhuhong You; Deshuang Huang; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 678 | Sea Ice Semantic Segmentation in Optical Image Based on Adaptive Training Sample Selection and Cross-Attention ResUNet Abstract: The formation of numerous channels among Arctic sea ice provides potential routes for Arctic navigation and the identification and semantic segmentation of sea ice becomes a … |
Zhiyong Yin; Yuqi Tang; F. Bovolo; | IEEE Geoscience and Remote Sensing Letters | 2025-01-01 |
| 679 | Multiview Integration Network for Multitask Robotic Surgical Scene Analysis Abstract: Surgical scene analysis holds a pivotal role in robot-assisted surgery. However, existing methods often suffer from a single or few views, leading to erroneous scene analysis … |
WENTING SHEN et al. | IEEE Transactions on Instrumentation and Measurement | 2025-01-01 |
| 680 | Efficient Real-Time Pathfinding for Visually Impaired Individuals Abstract: This paper presents a novel computer vision system, which enables real-time pathfinding for individuals with visual impairments. The navigation experience for visually impaired … |
Tadeh Ghahremanians; Hossein Mahvash Mohammadi; | IEEE Access | 2025-01-01 |
| 681 | LHAS: A Lightweight Network Based on Hierarchical Attention for Hyperspectral Image Segmentation Abstract: Deep learning has garnered extensive attention in hyperspectral image (HSI) processing. However, its application in HSI semantic segmentation tasks has been relatively limited. … |
LUJIE SONG et al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 682 | A Decoupled Segmentation-Classification Strategy Based on Semantic-SAM for Precise Semantic Segmentation in Coal Mine Areas Abstract: To address complex semantic segmentation in coal mine areas, this study proposes the SAM-SEF (SAM-based Semantic Enhancement Framework). It integrates Semantic-SAM’s zero-shot … |
LIBING WANG et al. | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 683 | Dense Segmentation Techniques Using Deep Learning for Urban Scene Parsing: A Review Abstract: Dense segmentation tasks, including semantic, instance, and panoptic segmentation, are essential for improving our comprehension of urban landscapes. This paper examines various … |
Rajesh Ankareddy; Radhakrishnan Delhibabu; | IEEE Access | 2025-01-01 |
| 684 | SalsaNext+: A Multimodal-Based Point Cloud Semantic Segmentation With Range and RGB Images Abstract: Advances in sensor fusion techniques are redefining the landscape of 3D point cloud semantic segmentation, particularly for autonomous driving applications. We propose an enhanced … |
FABIO SÁNCHEZ-GARCÍA et al. | IEEE Access | 2025-01-01 |
| 685 | Semantic Co-Occurrence and Relationship Modeling for Remote Sensing Image Segmentation Abstract: Semantic segmentation is an important but challenging task in pixel-level remote sensing (RS) data analysis. Accurate segmentation is essential for applications such as land use … |
Yinxing Zhang; Haochen Song; Qingwang Wang; Pengcheng Jin; Tao Shen; | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 686 | Boundary-aware Semantic Segmentation of Remote Sensing Images Via Segformer and Snake Convolution Abstract: Semantic segmentation of remote sensing images remains challenging due to complex object structures and varying scales. This paper proposes a novel hybrid segmentation model that … |
Yanting Xia; Lin Zhang; Ting Guo; Q. Jin; | Comput. Sci. Inf. Syst. | 2025-01-01 |
| 687 | HMAFNet: Hybrid Mamba-Attention Fusion Network for Remote Sensing Image Semantic Segmentation Abstract: Remote sensing (RS) images have rich ground information, diverse object types, and large-scale differences, and these characteristics make difficulties in achieving precise … |
Haoyue Sun; Jianjun Liu; Jinlong Yang; Zebin Wu; | IEEE Geoscience and Remote Sensing Letters | 2025-01-01 |
| 688 | A Comparative Study of Deep Learning Semantic Segmentation Models for Kidney Segmentation in Ultrasound Images Using The Open Kidney Ultrasound Dataset Abstract: Accurate segmentation of kidney ultrasound (KUS) images is essential for various diagnostic procedures. However, challenges such as variability in image quality and the need for … |
MOHAMMAD I. DAOUD et al. | IEEE Access | 2025-01-01 |
| 689 | Enhanced BP Algorithm Combined With Semantic Segmentation and Subaperture for Improving Agricultural Scene Image Quality in GEO SAR Abstract: Geosynchronous synthetic aperture radar (GEO SAR) plays a crucial role in various fields, such as crop growth monitoring, irrigation management, terrain and soil analysis, and … |
YIFAN WU et al. | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 690 | Clustering-Based Adaptive Query Generation for Semantic Segmentation Abstract: Semantic segmentation is one of the crucial tasks in the field of computer vision, aiming to label each pixel according to its class. Most recently, several semantic segmentation … |
Yeong Woo Kim; Wonjun Kim; | IEEE Signal Processing Letters | 2025-01-01 |
| 691 | An Alternating Guidance With Cross-View Teacher–Student Framework for Remote Sensing Semi-Supervised Semantic Segmentation Abstract: The semantic segmentation of remote sensing images is crucial for Earth observation. The semi-supervised semantic segmentation method can effectively reduce the dependence of the … |
Yujia Fu; Mingyang Wang; G. Vivone; Yunhong Ding; Lin Zhang; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 692 | An Improved Lightweight Tongue Image Semantic Segmentation Model Based on DeepLabV3+ |
YAN TANG et al. | Biomed. Signal Process. Control. | 2025-01-01 |
| 693 | SAM Enhanced Semantic Segmentation for Remote Sensing Imagery Without Additional Training IF:3 Abstract: Semantic segmentation is a critical process in remote sensing image analysis, supporting various applications. The recent development of the segment anything model (SAM), a visual … |
YANG QIAO et al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 694 | FGAseg: Fine-Grained Pixel-Text Alignment for Open-Vocabulary Semantic Segmentation Highlight: As a result, information extracted directly from VLMs can’t meet the requirements of segmentation tasks. To address this limitation, we propose FGAseg, a model designed for fine-grained pixel-text alignment and category boundary supplementation. |
Bingyu Li; Da Zhang; Zhiyuan Zhao; Junyu Gao; Xuelong Li; | arxiv-cs.CV | 2025-01-01 |
| 695 | INVITATION: A Framework for Enhancing UAV Image Semantic Segmentation Accuracy Through Depth Information Fusion Abstract: With the increasing use of uncrewed aerial vehicles (UAVs), improving the accuracy of semantic segmentation is becoming critical. Depth information preserves geometric structure, … |
XIAODONG ZHANG et al. | IEEE Geoscience and Remote Sensing Letters | 2025-01-01 |
| 696 | Semantic Uncertainty-Awared for Semantic Segmentation of Remote Sensing Images Abstract: Remote sensing image segmentation is crucial for applications ranging from urban planning to environmental monitoring. However, traditional approaches struggle with the unique … |
XIANGFENG QIU et al. | IET Image Process. | 2025-01-01 |
| 697 | CSFNet: Cross-Modal Semantic Focus Network for Semantic Segmentation of Large-Scale Point Clouds IF:3 Abstract: Semantic segmentation of large-scale point clouds is an indispensable component of outdoor scene perception, providing essential 3-D semantic insights for applications in scene … |
YANG LUO et al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 698 | LNFormer: Lightweight Design for Nighttime Semantic Segmentation With Transformer Abstract: General image semantic segmentation methods mainly focus on daytime images with sufficient light, nighttime images have low contrast and blurred details compared to daytime … |
Longsheng Wei; Yuhang Liao; | IEEE Transactions on Instrumentation and Measurement | 2025-01-01 |
| 699 | BaAFN: A Boundary-Aware Attention Fusion Network for Remote Sensing Semantic Segmentation Abstract: The performance of remote sensing semantic segmentation on object boundaries and small objects continues to pose a significant challenge due to the semantics near them being … |
Jiaen Chen; Shengjie Xu; Yuchen Zheng; | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 700 | SiMultiF: A Remote Sensing Multimodal Semantic Segmentation Network With Adaptive Allocation of Modal Weights for Siamese Structures in Multiscene IF:3 Abstract: Semantic segmentation of remote sensing images is crucial for resource exploration, precision agriculture, and environmental monitoring. However, conducting semantic segmentation … |
SHICHAO CUI et al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 701 | Tuple Perturbation-Based Contrastive Learning Framework for Multimodal Remote Sensing Image Semantic Segmentation IF:3 Abstract: Deep learning models exhibit promising potential in multimodal remote sensing image semantic segmentation (MRSISS). However, the constrained access to labeled samples for training … |
Y. YE et al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 702 | Wetland Scene Segmentation of Remote Sensing Images Based on Lie Group Feature and Graph Cut Model Abstract: Given the increasingly severe destruction of wetlands in recent years, research and monitoring for wetland protection are urgently needed. However, wetland monitoring still faces … |
Canyu Chen; Guobin Zhu; Xiliang Chen; | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 703 | BiFormer for Scene Graph Generation Based on VisionNet With Taylor Hiking Optimization Algorithm Abstract: Scene Graph Generation (SGG) plays a vital role in determining the graph structure of an image by classifying objects based on their pairwise visual relationships. In the SGG, … |
S. Monesh; N. C. Senthilkumar; | IEEE Access | 2025-01-01 |
| 704 | Open-Vocabulary High-Resolution Remote Sensing Image Semantic Segmentation IF:3 Abstract: Open-vocabulary image semantic segmentation (OVS) seeks to segment images into semantic regions across an open set of categories. Existing OVS methods commonly depend on … |
Qinglong Cao; Yuntian Chen; Chao Ma; Xiaokang Yang; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 705 | GLM-SFNet: Global-Local Vision-Mamba with Semantic Fusion for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
JIAHUI CHEN et. al. | International Conference on Medical Image Computing and … | 2025-01-01 |
| 706 | Adaptive Progressive Transformer-Based Trajectory Prediction Under Fine-Grained Trajectory-Scene Interaction Constraint Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Trajectory prediction is crucial in understanding human behavior around intelligent agents, such as self-driving vehicles or social robots. Nevertheless, conventional approaches … |
Rongrong Ni; Shichen Lu; Chuanping Hu; Biao Yang; | IEEE Transactions on Automation Science and Engineering | 2025-01-01 |
| 707 | Style Adaptation for Avoiding Semantic Inconsistency in Unsupervised Domain Adaptation Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Ziqiang Liu; Zhaomin Chen; Huiling Chen; Shu Teng; Lei Chen; | Biomed. Signal Process. Control. | 2025-01-01 |
| 708 | Data Fusion and Models Integration for Enhanced Semantic Segmentation in Remote Sensing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Remote sensing semantic segmentation is a key research area in the remote sensing domain. Despite advancements, there is still no unified standard dataset such as ImageNet for … |
Xiaorui Dong; Jiansheng Li; Qingfang Chang; Shufeng Miao; Hongxiang Wan; | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 709 | Object-Based Semantic Fusion Algorithm of Lidar and Camera Via Inverse Projection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Currently, multisensor fusion for point cloud semantic segmentation plays a pivotal role in robotics and autonomous driving. Lidar and the camera are two commonly used sensors, … |
XINGYU YUAN et. al. | IEEE Transactions on Instrumentation and Measurement | 2025-01-01 |
| 710 | Deep Geospatio-Semantic Guided Network With Pseudo-Label Consistency for Domain-Adaptive Remote Sensing Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Domain-adaptive remote sensing image (RSI) semantic segmentation mitigates the overfitting problem that affects the effectiveness of segmentation, which results from the scarcity … |
Jiawei Ning; Zhongle Ren; Biao Hou; Weibin Li; Licheng Jiao; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 711 | Unsupervised Remote Sensing Image Semantic Segmentation Based on Multiscale Contrastive Domain Adaptation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Unsupervised domain adaptation (UDA) for remote sensing image semantic segmentation aims to train a deep model on the labeled source domain and apply it to the unlabeled target … |
Jie Geng; Shuai Song; Zhen Xu; Wen Jiang; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 712 | An Improved Method for Zero-Shot Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Zero-shot semantic segmentation continues to face challenges in effectively handling unseen object classes, despite its critical applications in medical imaging, autonomous … |
Kong Kuok Yong; Tan Fong Ang; Chin Soon Ku; Firdaus Sahran; Lip Yee Por; | IEEE Access | 2025-01-01 |
| 713 | Confidence-Guided Joint Complementary Learning for Weakly Annotated Remote Sensing Object Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Object segmentation from weakly annotated remote sensing images (RSIs) is an essential task that helps substantially reduce pixelwise labeling costs. Although mainstream … |
Yanan Liu; Libao Zhang; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 714 | A Hybrid Medical Image Semantic Segmentation Network Based on Novel Mamba and Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jianting Shi; Huanhuan Liu; Zhijun Li; | IET Image Process. | 2025-01-01 |
| 715 | DiffRSS: A Diffusion-Guided Multi-Scale Features Remote Sensing Image Segmentation Method Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation in remote sensing is a fundamental task with crucial applications across various domains. Traditional approaches primarily utilize bottom-up discriminative … |
Honghao Liu; Ruixia Yang; Yue Xu; Zhengchao Chen; Yuyang Zheng; | IEEE Access | 2025-01-01 |
| 716 | UAVSeg: Dual-Encoder Cross-Scale Attention Network for UAV Images’ Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Benefiting from the powerful feature extraction and feature correlation modeling capabilities of convolutional neural networks (CNNs) and Transformer models, these techniques have … |
Zhen Wang; Zhuhong You; Nan Xu; Chuanlei Zhang; De-Shuang Huang; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 717 | RML: Efficient Representation Mutual Learning Framework for End-to-End Weakly Supervised Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Research on efficient semantic segmentation models is increasing the number of instrumentation and measurement applications. In recent years, there has been significant progress … |
Rongtao Xu; Changwei Wang; Shibiao Xu; Weiliang Meng; Xiaopeng Zhang; | IEEE Transactions on Instrumentation and Measurement | 2025-01-01 |
| 718 | Dual-Level Masked Semantic Inference for Semi-Supervised Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semi-supervised semantic segmentation pursues a holistic pixel-wise understanding of unseen images with limited annotation. To this end, existing methods focus on regularizing … |
QIANKUN MA et. al. | IEEE Transactions on Multimedia | 2025-01-01 |
| 719 | Robust One-Stop Multi-Modality Image Registration-Fusion-Segmentation Framework Against Misalignments and Adversarial Attacks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In complex open scenes, multi-modality image fusion and segmentation encounter two challenges: i) Imaging misalignments, manifested as pixel shifts and structural distortions, are … |
Di Wang; Xianghao Jiao; Jinyuan Liu; Xin-Yue Fan; | IEEE Transactions on Multimedia | 2025-01-01 |
| 720 | LEST: Large-Scale LiDAR Semantic Segmentation With Deployment-Friendly Transformer Architecture Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large-scale LiDAR-based point cloud semantic segmentation is a critical challenge for autonomous driving perception. Most state-of-the-art LiDAR semantic segmentation methods rely … |
CHUANYU LUO et. al. | IEEE Access | 2025-01-01 |
| 721 | PSDA: Pyramid Spatial Deformable Aggregation for Building Segmentation in Multiview Remote Sensing Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: As increasingly more deep learning models are designed and implemented, the performance of single-view image semantic segmentation is approaching its upper limit. With the … |
XUEJUN HUANG et. al. | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 722 | Cross-Band Correlation-Aware Interactive Fusion for Multispectral Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Multispectral homogeneous bands capture distinct and complementary spectral characteristics; therefore, fusing multiple bands has the potential to increase semantic segmentation … |
I. Ulku; O. Ozgur Tanriover; Erdem Akagündüz; | IEEE Geoscience and Remote Sensing Letters | 2025-01-01 |
| 723 | DMoC-UNet: A Dynamic Mixture-of-Convolution Network for Enhanced Pathological Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Pathological image segmentation is a cornerstone in medical image analysis and is crucial for tumor detection, tissue classification, and pathological diagnosis. Existing methods … |
JINGWEI ZHU et. al. | IEEE Access | 2025-01-01 |
| 724 | Robust Semantic Learning for Precise Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
SNEHASHIS CHAKRABORTY et. al. | Biomed. Signal Process. Control. | 2025-01-01 |
| 725 | DarkSegNet: Low-light Semantic Segmentation Network Based on Image Pyramid IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Jintao Tan; Longyang Huang; Zhonghui Chen; Ruokun Qu; Chenglong Li; | Signal Process. Image Commun. | 2025-01-01 |
| 726 | FBINet: Few-Shot Semantic Segmentation With Foreground and Background Iteration Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Defect detection methods based on few-shot segmentation are becoming more and more popular in industrial applications, and few-shot segmentation methods need to use only a limited … |
Zhifu Huang; Ziwei Chen; Yu Liu; | IEEE Transactions on Instrumentation and Measurement | 2025-01-01 |
| 727 | A Lightweight Semantic Segmentation Network Based on Self-Attention Mechanism and State Space Model for Efficient Urban Scene Segmentation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the semantic segmentation of remote sensing images, methods based on convolutional neural networks (CNNs) and Transformers have been extensively studied. Nevertheless, CNN … |
Langping Li; Jizheng Yi; Hui Fan; Hui Lin; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 728 | Optimized Ensemble Learning for Semantic Segmentation of Satellite Imagery Using DeepLabV3+ and UNet With PSO and Cross-Dataset Evaluation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation is critical in remote sensing applications such as urban planning, disaster management, and environmental monitoring. However, segmenting complex satellite … |
Gurjot Kaur; Salil Bharany; Dalia H. Elkamchouchi; Seongki Kim; | IEEE Access | 2025-01-01 |
| 729 | Cross-Scale Feature Interaction Network for Semantic Segmentation in Side-Scan Sonar Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic Segmentation in side-scan sonar images (SSS-Seg) is an emerging topic and plays an important function in sonar image interpretation. However, due to the interference of … |
Zhen Wang; Zhuhong You; Nan Xu; Buhong Wang; De-Shuang Huang; | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 730 | FiVOS: A Fish Segmentation Algorithm Based on Interactive Video Object Segmentation and Filter Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
YUQING DUAN et. al. | Comput. Electron. Agric. | 2025-01-01 |
| 731 | Semantic Segmentation of Assembly Images Combining Deep Learning and Ontological Reasoning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the advancement of industrial intelligence, semantic segmentation of assembly images is increasingly widely used in automated production and other fields. Aiming at the … |
Han Zhang; Xiaolin Shi; Haisong Xu; Yi Li; Liping Ma; | IEEE Access | 2025-01-01 |
| 732 | MCKTNet: Multiscale Cross-Modal Knowledge Transfer Network for Semantic Segmentation of Remote Sensing Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Multimodal data fusion can provide valuable and diverse information for remote sensing image segmentation. However, different modal data have different feature distributions, … |
Jian Cui; Jiahang Liu; Yue Ni; Yuan Sun; Mao-yin Guo; | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 733 | SAM2Former: Segment Anything Model 2 Assisting UNet-Like Transformer for Remote Sensing Image Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Remote sensing semantic segmentation plays a crucial role in the fields of land cover classification, disaster monitoring, and urban planning. However, due to the high complexity … |
XUEWEN LI et. al. | IEEE Access | 2025-01-01 |
| 734 | Semantic Segmentation-Driven Knowledge Distillation-Based Infrared Visible Image Fusion Framework Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The goal of infrared and visible image fusion is to generate a fused image that integrates both prominent targets and fine textures. However, many existing fusion algorithms … |
Xingshuo Wang; | IEEE Access | 2025-01-01 |
| 735 | PMTSeg: Prompt-Driven Multimodal Transformer for Task-Adapted Remote Sensing Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Multimodal remote sensing image segmentation (MRSIS) is important for intelligent remote sensing (RS) image interpretation, which encompasses three distinct tasks: semantic … |
KEJUN LIU et. al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 736 | SSDFusion: A Semantic Segmentation Driven Framework for Infrared and Visible Image Fusion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Fusing infrared images with visible images facilitates obtaining more abundant and accurate information content. However, existing infrared and visible image fusion methods often … |
QISHEN LV et. al. | IEEE Access | 2025-01-01 |
| 737 | OMRF-HS: Object Markov Random Field With Hierarchical Semantic Regularization for High-Resolution Image Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: As spatial resolution increases in remote-sensing imagery, the challenge of semantic segmentation intensifies due to the need to discern intricate changes in terrain. Terrain, a … |
HAOYU FU et. al. | IEEE Transactions on Geoscience and Remote Sensing | 2025-01-01 |
| 738 | Noise-Resilient With Scattering-Aware Network for SAR Image Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Synthetic Aperture Radar (SAR) image semantic segmentation faces significant challenges, including severe speckle noise, intricate land cover patterns, weak feature background … |
Zhen Wang; Jiayuan Li; Nan Xu; Zhuhong You; | IEEE Journal of Selected Topics in Applied Earth … | 2025-01-01 |
| 739 | EPNet: An Efficient Postprocessing Network for Enhancing Semantic Segmentation in Autonomous Driving Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation is of great importance in the field of autonomous driving, as it provides semantic information for a scene that intelligent vehicles need to interact with. … |
Libo Sun; Jiatong Xia; Hui Xie; Changming Sun; | IEEE Transactions on Instrumentation and Measurement | 2025-01-01 |
| 740 | PanoSLAM: Panoptic 3D Scene Reconstruction Via Gaussian SLAM Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce PanoSLAM, the first SLAM system to integrate geometric reconstruction, 3D semantic segmentation, and 3D instance segmentation within a unified framework. |
RUNNAN CHEN et. al. | arxiv-cs.CV | 2024-12-31 |
| 741 | OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose OVGaussian, a generalizable Open-Vocabulary 3D semantic segmentation framework based on the 3D Gaussian representation. |
RUNNAN CHEN et. al. | arxiv-cs.CV | 2024-12-31 |
| 742 | HisynSeg: Weakly-Supervised Histopathological Image Segmentation Via Image-Mixing Synthesis and Consistency Regularization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, CAM-based methods are prone to suffer from under-activation and over-activation issues, leading to poor segmentation performance. To address this problem, we propose a novel weakly-supervised semantic segmentation framework for histopathological images based on image-mixing synthesis and consistency regularization, dubbed HisynSeg. |
Zijie Fang; Yifeng Wang; Peizhang Xie; Zhi Wang; Yongbing Zhang; | arxiv-cs.CV | 2024-12-30 |
| 743 | LiDAR-Camera Fusion for Video Panoptic Segmentation Without Video Training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work seeks to introduce a feature fusion module that enhances PS and VPS by fusing LiDAR and image data for autonomous vehicles. |
Fardin Ayar; Ehsan Javanmardi; Manabu Tsukada; Mahdi Javanmardi; Mohammad Rahmati; | arxiv-cs.CV | 2024-12-30 |
| 744 | LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a Language-Embedded Surface Field (LangSurf), which accurately aligns the 3D language fields with the surface of objects, facilitating precise 2D and 3D segmentation with text query, widely expanding the downstream tasks such as removal and editing. |
HAO LI et. al. | arxiv-cs.CV | 2024-12-23 |
| 745 | Multi-Scale Foreground-Background Confidence for Out-of-Distribution Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present a multi-scale OOD segmentation method that exploits the confidence information of a foreground-background segmentation model. |
Samuel Marschall; Kira Maag; | arxiv-cs.CV | 2024-12-22 |
| 746 | Leveraging Contrastive Learning for Semantic Segmentation with Consistent Labels Across Varying Appearances Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper introduces a novel synthetic dataset that captures urban scenes under a variety of weather conditions, providing pixel-perfect, ground-truth-aligned images to … |
JAVIER MONTALVO et. al. | ArXiv | 2024-12-21 |
| 747 | Imaging Segmentation of Brain Tumors Based on The Modified U-net Method IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Brain tumor segmentation in medical image analysis is a challenging task. Deep learning techniques have recently shown promise in resolving a variety of computer vision problems, … |
Yajie Zhang; Hea Choon Ngo; Yifan Zhang; Noor Fazilla Abd Yusof; Xiaohan Wang; | Inf. Technol. Control. | 2024-12-21 |
| 748 | VerSe: Integrating Multiple Queries As Prompts for Versatile Cardiac MRI Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, they are semi-automatic and inefficient, due to their reliance on click-based prompts, especially for 3D cardiac MRI volumes. To address these limitations, we propose VerSe, a Versatile Segmentation framework to unify automatic and interactive segmentation through multiple queries. |
BANGWEI GUO et. al. | arxiv-cs.CV | 2024-12-20 |
| 749 | VerSe: Integrating Multiple Queries As Prompts for Versatile Cardiac MRI Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Despite the advances in learning-based image segmentation approach, the accurate segmentation of cardiac structures from magnetic resonance imaging (MRI) remains a critical … |
BANGWEI GUO et. al. | ArXiv | 2024-12-20 |
| 750 | Incorporating Feature Pyramid Tokenization and Open Vocabulary Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we treat segmentation as tokenizing pixels and study a united perceptual and semantic token compression for all granular understanding and consequently facilitate open vocabulary semantic segmentation. |
Jianyu Zhang; Li Zhang; Shijian Li; | arxiv-cs.CV | 2024-12-18 |
| 751 | Language-guided Medical Image Segmentation with Target-informed Multi-level Contrastive Alignments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we propose a language-guided segmentation network with Target-informed Multi-level Contrastive Alignments (TMCA). |
MINGJIAN LI et. al. | arxiv-cs.CV | 2024-12-18 |
| 752 | Edge-Centric Real-Time Segmentation for Autonomous Underwater Cave Exploration Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper addresses the challenge of deploying machine learning (ML)-based segmentation models on edge platforms to facilitate real-time scene segmentation for Autonomous … |
MOHAMMADREZA MOHAMMADI et. al. | 2024 International Conference on Machine Learning and … | 2024-12-18 |
| 753 | Open-World Panoptic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this article, we tackle the problem of open-world panoptic segmentation, i.e., the task of discovering new semantic categories and new object instances at test time, while enforcing consistency among the categories that we incrementally discover. |
Matteo Sodano; Federico Magistri; Jens Behley; Cyrill Stachniss; | arxiv-cs.CV | 2024-12-17 |
| 754 | SEG-SAM: Semantic-Guided SAM for Unified Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: SAM has shown promising binary segmentation performance in natural domains, however, transferring it to the medical domain remains challenging, as medical images often possess substantial inter-category overlaps. To address this, we propose the SEmantic-Guided SAM (SEG-SAM), a unified medical segmentation model that incorporates semantic medical knowledge to enhance medical segmentation performance. |
SHUANGPING HUANG et. al. | arxiv-cs.CV | 2024-12-17 |
| 755 | Efficient Image Transmission Using Semantic Communication for Static Environment Surveillance Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic Communication is a novel approach that focuses on transmitting only essential information, leading to significant reductions in the number of bits required and conserving … |
K. R. Nandakishore; Rayani Venkat; Sai Rithvik; Mohammed Zafar; Ali Khan; | 2024 IEEE International Conference on Advanced Networks and … | 2024-12-15 |
| 756 | PRS-Net: A Few-shot Network for Perirenal Fat and Renal Parenchyma Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This work proposes a Few-shot network employed for Perirenal Fat and Renal Parenchyma Semantic Segmentation (PRS-Net) in Diabetic Kidney Disease (DKD). PRS-Net integrates ST … |
XIANGYU MENG et. al. | 2024 IEEE International Conference on Big Data (BigData) | 2024-12-15 |
| 757 | Classification Drives Geographic Bias in Street Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We examined if instance segmentation models trained on European driving scenes (Eurocentric models) are geo-biased. |
Rahul Nair; Gabriel Tseng; Esther Rolf; Bhanu Tokas; Hannah Kerner; | arxiv-cs.CV | 2024-12-15 |
| 758 | DCSEG: Decoupled 3D Open-Set Segmentation Using Gaussian Splatting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present a decoupled 3D segmentation pipeline to ensure modularity and adaptability to novel 3D representations as well as semantic segmentation foundation models. |
Luis Wiedmann; Luca Wiehe; David Rozenberszki; | arxiv-cs.CV | 2024-12-14 |
| 759 | SuperGSeg: Open-Vocabulary 3D Segmentation with Structured Super-Gaussians IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we introduce SuperGSeg, a novel approach that fosters cohesive, context-aware scene representation by disentangling segmentation and language field distillation. |
SIYUN LIANG et. al. | arxiv-cs.CV | 2024-12-13 |
| 760 | SPT: Sequence Prompt Transformer for Interactive Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing methods typically process one image at a time, failing to consider the sequential nature of the images. To overcome this limitation, we propose a novel method called Sequence Prompt Transformer (SPT), the first to utilize sequential image information for interactive segmentation. |
Senlin Cheng; Haopeng Sun; | arxiv-cs.CV | 2024-12-13 |
| 761 | A Deep Semantic Segmentation Network with Semantic and Contextual Refinements Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation is a fundamental task in multimedia processing, which can be used for analyzing, understanding, editing contents of images and videos, among others. To … |
ZHIYAN WANG et. al. | ArXiv | 2024-12-11 |
| 762 | Superpixel-Based Hierarchical Graph Convolution for Enhanced Interpretability in Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Despite the success of pixel-based black-box models in most semantic segmentation algorithms for autonomous driving technology, identifying the causes of recognition errors … |
Yimeng Dong; Guangyao Liu; Yankai Yin; Mengxuan Wu; Feng Duan; | 2024 IEEE International Conference on Robotics and … | 2024-12-10 |
| 763 | Channel Selection and Local Attention Transformer Model for Semantic Segmentation on UAV Remote Sensing Scene Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Compared with common urban landscape semantic segmentation, unmanned aerial vehicle (UAV) image semantic segmentation is more challenging because small targets have very low pixel … |
Da Liu; Hao Long; Zhenbao Liu; | IET Image Process. | 2024-12-09 |
| 764 | Semantic Segmentation and Spatial Relationship Modeling in Hyperspectral Imagery Using Deep Learning and Graph-Based Representations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The effective analysis of spatial data from diverse sources, such as satellite imagery and aerial views, remains pivotal for informed decision-making across various domains. This … |
Ravikumar Yenni; Arun P V; | 2024 14th Workshop on Hyperspectral Imaging and Signal … | 2024-12-09 |
| 765 | GCUNet: A GNN-Based Contextual Learning Network for Tertiary Lymphoid Structure Semantic Segmentation in Whole Slide Image Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, this prevents the model from accessing information outside of the patches, limiting the performance. To address this issue, we propose GCUNet, a GNN-based contextual learning network for TLS semantic segmentation. |
Lei Su; Yang Du; | arxiv-cs.CV | 2024-12-08 |
| 766 | Efficient Semantic Splatting for Remote Sensing Multiview Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Remote sensing multiview image segmentation is essential for achieving accurate and consistent stereoscopic perception of target scenes. This task involves processing RGB images … |
Zipeng Qi; Hao Chen; Haotian Zhang; Zhengxia Zou; Z. Shi; | IEEE Transactions on Geoscience and Remote Sensing | 2024-12-08 |
| 767 | LULC-SegNet: Enhancing Land Use and Land Cover Semantic Segmentation with Denoising Diffusion Feature Fusion IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Deep convolutional networks often encounter information bottlenecks when extracting land object features, resulting in critical geometric information loss, which impedes semantic … |
Zongwen Shi; Junfu Fan; Yujie Du; Yuke Zhou; Yi Zhang; | Remote. Sens. | 2024-12-06 |
| 768 | Point-GR: Graph Residual Point Cloud Network for 3D Object Classification and Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents Point-GR, a novel deep learning architecture designed explicitly to transform unordered raw point clouds into higher dimensions while preserving local geometric features. |
Md Meraz; Md Afzal Ansari; Mohammed Javed; Pavan Chakraborty; | arxiv-cs.CV | 2024-12-04 |
| 769 | CA-OVS: Cluster and Adapt Mask Proposals for Open-Vocabulary Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Recent Open-Vocabulary Semantic Segmentation (OVS) works typically follow the mask proposal pipeline that decouples semantic segmentation into class-agnostic mask generation and … |
S. D. Dao; Hengcan Shi; Dinh Q. Phung; Jianfei Cai; | Proceedings of the 6th ACM International Conference on … | 2024-12-03 |
| 770 | MFSegDiff: A Multi-Frequency Diffusion Model for Medical Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Medical image segmentation accurately identifies and delineates diagnostic regions, which is a crucial step in the early detection and accurate diagnosis of diseases. Diffusion … |
Zidi Shi; Hua Zou; Fei Luo; Zhiyu Huo; | 2024 IEEE International Conference on Bioinformatics and … | 2024-12-03 |
| 771 | MedSegViG: Medical Image Segmentation with A Vision Graph Neural Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Medical image segmentation is a crucial step toward automatic clinical diagnosis, which has received growing interest. Although some existing methods based on convolutional neural … |
XINHONG LI et. al. | 2024 IEEE International Conference on Bioinformatics and … | 2024-12-03 |
| 772 | Progressive Stepwise Diffusion Model with Dual Decoders for Semi-Supervised Medical Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semi-supervised medical image segmentation tasks aim to harness the potential of vast amounts of unlabeled data using a limited amount of annotated data. Denoising Diffusion … |
XIAOLIN HUANG et. al. | 2024 IEEE International Conference on Bioinformatics and … | 2024-12-03 |
| 773 | SJTU:Spatial Judgments in Multimodal Models Towards Unified Segmentation Through Coordinate Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces SJTU: Spatial Judgments in multimodal models – Towards Unified segmentation through coordinate detection, a novel framework that leverages spatial coordinate understanding to bridge vision-language interaction and precise segmentation, enabling accurate target identification through natural language instructions. |
Joongwon Chae; Zhenyu Wang; Peiwu Qin; | arxiv-cs.CV | 2024-12-03 |
| 774 | Mamba-SAM: An Adaption Framework for Accurate Medical Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The Segment Anything Model (SAM) shows strong performance in natural images but struggles with medical images due to a significant semantic gap and characteristics like … |
YIFENG WU et. al. | 2024 IEEE International Conference on Bioinformatics and … | 2024-12-03 |
| 775 | Semantic Segmentation Prior for Diffusion-Based Real-World Super-Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Additionally, the same region may have a strong response to more than one prompt and it will lead to semantic ambiguity for image super-resolution. To alleviate the above two issues, in this paper, we propose to consider semantic segmentation as an additional control condition into diffusion-based image super-resolution. |
JIAHUA XIAO et. al. | arxiv-cs.CV | 2024-12-03 |
| 776 | Test-Time Medical Image Segmentation Using CLIP-Guided SAM Adaptation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Test-time medical image segmentation is a critical component in clinical practice, enabling pre-trained medical segmentation models to effectively adapt to unseen medical samples … |
Haotian Chen; Yonghui Xu; Yanyu Xu; Yixin Zhang; Li-zhen Cui; | 2024 IEEE International Conference on Bioinformatics and … | 2024-12-03 |
| 777 | ECSeg: Edge-Cloud Switched Image Segmentation for Autonomous Vehicles Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Existing autonomous vehicles have not utilized the cloud computing for execution of their deep learning-based driving tasks due to the long vehicle-to-cloud communication latency. … |
Siyuan Zhou; D. V. Le; Rui Tan; | 2024 21st Annual IEEE International Conference on Sensing, … | 2024-12-02 |
| 778 | Advancing Perturbation Space Expansion Based on Information Fusion for Semi-supervised Remote Sensing Image Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Liang Zhou; Keyi Duan; Jinkun Dai; Yuanxin Ye; | Inf. Fusion | 2024-12-01 |
| 779 | RailEINet: A Novel Scene Segmentation Network for Automatic Train Operation Based on Feature Alignment IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Tao Sun; Baoqing Guo; Tao Ruan; Xingfang Zhou; Dingyuan Bai; | Eng. Appl. Artif. Intell. | 2024-12-01 |
| 780 | Domain Adaptation Transformer for Unsupervised Driving-Scene Segmentation in Adverse Conditions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation in driving scenarios is important for modern autonomous driving technology. While the existing methods have shown promising results in segmenting … |
Wenyu Liu; Song Wang; Jianke Zhu; Xuansong Xie; Lei Zhang; | IEEE Transactions on Intelligent Transportation Systems | 2024-12-01 |
| 781 | Density-aware Global-Local Attention Network for Point Cloud Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The point cloud data collected in real scenes often contain small objects and categories with small sample sizes, which are difficult to handle by existing networks. In this regard, we propose a point cloud segmentation network that fuses local attention based on density perception with global attention. |
Chade Li; Pengju Zhang; Yihong Wu; | arxiv-cs.CV | 2024-11-30 |
| 782 | GradiSeg: Gradient-Guided Gaussian Segmentation with Enhanced 3D Boundary Precision IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While 3D Gaussian Splatting enables high-quality real-time rendering, existing Gaussian-based frameworks for 3D semantic segmentation still face significant challenges in boundary recognition accuracy. To address this, we propose a novel 3DGS-based framework named GradiSeg, incorporating Identity Encoding to construct a deeper semantic understanding of scenes. |
ZEHAO LI et. al. | arxiv-cs.CV | 2024-11-30 |
| 783 | LMSeg: Unleashing The Power of Large-Scale Models for Open-Vocabulary Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose to alleviate the above-mentioned issues by leveraging multiple large-scale models to enhance the alignment between fine-grained visual features and enriched linguistic features. |
HUADONG TANG et. al. | arxiv-cs.CV | 2024-11-30 |
| 784 | Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we present FreeGS, an unsupervised semantic-embedded 3DGS framework that achieves view-consistent 3D scene understanding without the need for 2D labels. |
WENBO ZHANG et. al. | arxiv-cs.CV | 2024-11-29 |
| 785 | Enhancing Semantic Segmentation with Synthetic Image Generation: A Novel Approach Using Stable Diffusion and ControlNet Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper presents a novel methodology for generating synthetic images that adhere accurately to provided semantic segmentation maps using the Stable Diffusion model with the … |
Austin Bevacqua; Tanmay Singha; Duc-Son Pham; | 2024 International Conference on Digital Image Computing: … | 2024-11-27 |
| 786 | OUTBACK: A Multimodal Synthetic Dataset for Rural Australian Off-road Robot Navigation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: One of the most important aspects of robot scene understanding is semantic segmentation of external environments. Urban environment semantic segmentation has been extensively … |
Liyana Wijayathunga; Dulitha Dabare; A. Rassau; Douglas Chai; S. Islam; | 2024 International Conference on Digital Image Computing: … | 2024-11-27 |
| 787 | Semantic Image Segmentation of Cell Volumes Using 3D U-Net Convolutional Neural Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics. Traditionally image … |
LAZAR DASIC et. al. | 2024 IEEE 24th International Conference on Bioinformatics … | 2024-11-27 |
| 788 | PFFNet: A Pyramid Feature Fusion Network for Microaneurysm Segmentation in Fundus Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Retinal microaneurysm (MA) is a definite earliest clinical sign of diabetic retinopathy (DR). Its automatic segmentation is key to realizing intelligent screening for early DR, … |
JIAXIN LU et. al. | IET Image Process. | 2024-11-27 |
| 789 | HoliSDiP: Image Super-Resolution Via Holistic Semantics and Diffusion Prior Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Text-to-image diffusion models have emerged as powerful priors for real-world image super-resolution (Real-ISR). However, existing methods may produce unintended results due to … |
LI-YUAN TSAO et. al. | ArXiv | 2024-11-27 |
| 790 | Box for Mask and Mask for Box: Weak Losses for Multi-task Partially Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose Box-for-Mask and Mask-for-Box strategies, and their combination BoMBo, to distil necessary information from one task annotations to train the other. |
Hoàng-Ân Lê; Paul Berg; Minh-Tan Pham; | arxiv-cs.CV | 2024-11-26 |
| 791 | A Performance Increment Strategy for Semantic Segmentation of Low-Resolution Images from Damaged Roads Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: A representative dataset for emerging countries consists of low-resolution images of poorly maintained roads and includes labels of damage classes; in this scenario, three challenges arise: objects with few pixels, objects with undefined shapes, and highly underrepresented classes. To tackle these challenges, this work proposes the Performance Increment Strategy for Semantic Segmentation (PISSS) as a methodology of 14 training experiments to boost performance. |
Rafael S. Toledo; Cristiano S. Oliveira; Vitor H. T. Oliveira; Eric A. Antonelo; Aldo von Wangenheim; | arxiv-cs.CV | 2024-11-25 |
| 792 | Learn from Foundation Model: Fruit Detection Model Without Manual Annotation Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Recent breakthroughs in large foundation models have enabled the possibility of transferring knowledge pre-trained on vast datasets to domains with limited data availability. … |
Yanan Wang; Zhenghao Fei; Ruichen Li; Yibin Ying; | ArXiv | 2024-11-25 |
| 793 | Effective SAM Combination for Open-Vocabulary Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose ESC-Net, a novel one-stage open-vocabulary segmentation model that leverages the SAM decoder blocks for class-agnostic segmentation within an efficient inference framework. |
MINHYEOK LEE et. al. | arxiv-cs.CV | 2024-11-21 |
| 794 | BelHouse3D: A Benchmark Dataset for Assessing Occlusion Robustness in 3D Point Cloud Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing 3D benchmarking datasets typically evaluate deep learning models under the assumption that training and test data are independently and identically distributed (IID), which affects the models’ usability for real-world point cloud segmentation. To address these challenges, we introduce the BelHouse3D dataset, a new synthetic point cloud dataset designed for 3D indoor scene semantic segmentation. |
Umamaheswaran Raman Kumar; Abdur Razzaq Fayjie; Jurgen Hannaert; Patrick Vandewalle; | arxiv-cs.CV | 2024-11-20 |
| 795 | SAM Carries The Burden: A Semi-Supervised Approach Refining Pseudo Labels for Medical Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: The recently introduced Segment Anything Model (SAM) enables prompt-based segmentation and offers zero-shot generalization to unfamiliar objects. |
RON KEUTH et. al. | arxiv-cs.CV | 2024-11-19 |
| 796 | TP-UNet: Temporal Prompt Guided UNet for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To efficiently integrate temporal information, we propose TP-UNet that utilizes temporal prompts, encompassing organ-construction relationships, to guide the segmentation UNet model. |
Ranmin Wang; Limin Zhuang; Hongkun Chen; Boyan Xu; Ruichu Cai; | arxiv-cs.CV | 2024-11-18 |
| 797 | Calibrated and Efficient Sampling-Free Confidence Estimation for LiDAR Scene Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a sampling-free approach for estimating well-calibrated confidence values for classification tasks, achieving alignment with true classification accuracy and significantly reducing inference time compared to sampling-based methods. |
Hanieh Shojaei Miandashti; Qianqian Zou; Claus Brenner; | arxiv-cs.CV | 2024-11-18 |
| 798 | ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we enhance the semantic segmentation performance of CLIP by introducing new modules and modifications: 1) architectural changes in the last layer of ViT and the incorporation of attention maps from the middle layers with the last layer, 2) Image Engineering: applying data augmentations to enrich input image representations, and 3) using Large Language Models (LLMs) to generate definitions and synonyms for each class name to leverage CLIP’s open-vocabulary capabilities. |
M. Arda Aydın; Efe Mert Çırpar; Elvin Abdinli; Gozde Unal; Yusuf H. Sahin; | arxiv-cs.CV | 2024-11-18 |
| 799 | ULTra: Unveiling Latent Token Interpretability in Transformer-Based Understanding and Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, their complexity makes latent token representations difficult to interpret. We introduce ULTra, a framework for interpreting Transformer embeddings and uncovering meaningful semantic patterns within them. |
Hesam Hosseini; Ghazal Hosseini Mighan; Amirabbas Afzali; Sajjad Amini; Amir Houmansadr; | arxiv-cs.CV | 2024-11-15 |
| 800 | ULTra: Unveiling Latent Token Interpretability in Transformer Based Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Transformers have revolutionized Computer Vision (CV) through self-attention mechanisms. However, their complexity makes latent token representations difficult to interpret. We … |
Hesam Hosseini; Ghazal Hosseini Mighan; Amirabbas Afzali; Sajjad Amini; Amir Houmansadr; | ArXiv | 2024-11-15 |
| 801 | Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Specifically, we introduce Trident, a training-free framework that first splices features extracted by CLIP and DINO from sub-images, then leverages SAM’s encoder to create a correlation matrix for global aggregation, enabling a broadened receptive field for effective segmentation. |
Yuheng Shi; Minjing Dong; Chang Xu; | arxiv-cs.CV | 2024-11-14 |
| 802 | Contrastive Patch Comparison Transformer for Weakly Supervised Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Weakly supervised semantic segmentation (WSSS) focuses on utilizing image-level labels to classify all pixels into object classes or background class. Class Activation Mapping … |
Wentian Cai; Weixian Yang; Jing Lin; Ying Gao; | 2024 IEEE International Conference on Smart Internet of … | 2024-11-14 |
| 803 | Heuristical Comparison of Vision Transformers Against Convolutional Neural Networks for Semantic Segmentation on Remote Sensing Imagery Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Vision Transformers (ViT) have recently brought a new wave of research in the field of computer vision. These models have performed particularly well in image classification and segmentation. |
Ashim Dahal; Saydul Akbar Murad; Nick Rahimi; | arxiv-cs.CV | 2024-11-13 |
| 804 | Slender Object Scene Segmentation in Remote Sensing Image Based on Learnable Morphological Skeleton with Segment Anything Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a new approach that integrates learnable morphological skeleton prior into deep neural networks using the variational method. |
JUN XIE et. al. | arxiv-cs.CV | 2024-11-13 |
| 805 | Zero-shot Capability of SAM-family Models for Bone Segmentation in CT Scans Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The Segment Anything Model (SAM) and similar models build a family of promptable foundation models (FMs) for image and video segmentation. |
Caroline Magg; Hoel Kervadec; Clara I. Sánchez; | arxiv-cs.CV | 2024-11-13 |
| 806 | Semantic Segmentation with Attention-Modulated Feature Fusion in HRNET V2 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation is pivotal for precise object identification and localization within images, a cornerstone for automated analysis and machine vision. Despite advancements, … |
Weijie Zhang; Shuhei Kaneko; Shuichi Arai; | 2024 International Symposium on Information Theory and Its … | 2024-11-10 |
| 807 | Superpixel Segmentation: A Long-Lasting Ill-Posed Problem Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Concurrently, recent deep learning-based superpixel methods mainly focus on the object segmentation task at the expense of regularity. In this ill-posed context, we show that we can achieve competitive results using a recent architecture like the Segment Anything Model (SAM), without dedicated training for the superpixel segmentation task. |
Rémi Giraud; Michaël Clément; | arxiv-cs.CV | 2024-11-10 |
| 808 | ZAHA: Introducing The Level of Facade Generalization and The Large-Scale Point Cloud Facade Semantic Segmentation Benchmark Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In ZAHA, we introduce Level of Facade Generalization (LoFG), novel hierarchical facade classes designed based on international urban modeling standards, ensuring compatibility with real-world challenging classes and uniform methods’ comparison. |
OLAF WYSOCKI et. al. | arxiv-cs.CV | 2024-11-07 |
| 809 | OLAF: A Plug-and-Play Framework for Enhanced Multi-object Multi-part Scene Parsing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the task, we propose a plug-and-play approach termed OLAF. |
Pranav Gupta; Rishubh Singh; Pradeep Shenoy; Ravikiran Sarvadevabhatla; | arxiv-cs.CV | 2024-11-05 |
| 810 | Panoramic Image Semantic Segmentation Using Channel Attention-based HarDNet and Distorted Boundary Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Xun Jin; Chongyang Zhu; De Li; | Multim. Syst. | 2024-11-01 |
| 811 | Enhanced Scene Understanding and Situation Awareness for Autonomous Vehicles Based on Semantic Segmentation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Accurate visual perception and comprehensive scene understanding are critical for the safety and reliability of autonomous vehicles (AVs). Nevertheless, the efficacy of visual … |
YIYUE ZHAO et. al. | IEEE Transactions on Systems, Man, and Cybernetics: Systems | 2024-11-01 |
| 812 | Temporal Consistency for RGB-Thermal Data-Based Semantic Scene Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic scene understanding is a fundamental capability for autonomous vehicles. Under challenging lighting conditions, such as nighttime and on-coming headlights, the semantic … |
Haotian Li; Henry K. Chu; Yuxiang Sun; | IEEE Robotics and Automation Letters | 2024-11-01 |
| 813 | L-DeeplabV3+: A Lightweight Semantic Segmentation Algorithm for Complex Scene Perception Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Current semantic segmentation algorithms are often burdened by high computational complexity and inadequate boundary localization accuracy in complex scenarios of … |
ZHENGSHUN FEI et. al. | Journal of Electronic Imaging | 2024-11-01 |
| 814 | Cityscape-Adverse: Benchmarking Robustness of Semantic Segmentation with Realistic Scene Modifications Via Diffusion-Based Image Editing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce Cityscape-Adverse, a benchmark that employs diffusion-based image editing to simulate eight adverse conditions, including variations in weather, lighting, and seasons, while preserving the original semantic labels. |
NAUFAL SURYANTO et. al. | arxiv-cs.CV | 2024-11-01 |
| 815 | Image Synthesis with Class-Aware Semantic Diffusion Models for Surgical Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In response, we propose the Class-Aware Semantic Diffusion Model (CASDM), a novel approach which utilizes segmentation maps as conditions for image synthesis to tackle data scarcity and imbalance. |
Yihang Zhou; Rebecca Towning; Zaid Awad; Stamatia Giannarou; | arxiv-cs.CV | 2024-10-31 |
| 816 | Interactive Segmentation By Considering First-Click Intentional Ambiguity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Given the fact that most of the related algorithms generate a single mask only, the robustness of which might be constrained due to the diversity of user intention in the early interaction stage, namely the vague selection of object part/whole object/adherent object, especially when there’s only one click. To handle this, we propose a novel framework called Diversified Interactive Segmentation Network (DISNet) in which we revisit the peculiarity of first-click: given an input image, DISNet outputs multiple candidate masks under the guidance of first-click only, then a Dual-attentional Mask Correction (DAMC) module is utilized to measure the complex mutual effect within first-click, all-clicks and image features. |
Kangpeng Hu; Quansen Sun; Yinghui Sun; Tao Wang; | mm | 2024-10-30 |
| 817 | Few-shot Semantic Segmentation Via Perceptual Attention and Spatial Control Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, due to probabilistic noising and denoising processes, it is difficult for them to maintain spatial relationships between inputs and outputs, leading to inaccurate segmentation masks. To address this issue, we propose a Diffusion-based Segmentation network (DiffSeg), which decouples probabilistic denoising and segmentation processes. |
GUANGCHEN SHI et. al. | mm | 2024-10-30 |
| 818 | Multi-fineness Boundaries and The Shifted Ensemble-aware Encoding for Point Cloud Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: There is limited focus on explicitly addressing semantic segmentation of point cloud boundaries. We introduce a method called Multi-fineness Boundary Constraint (MBC) to tackle this challenge. |
ZIMING WANG et. al. | mm | 2024-10-30 |
| 819 | ProgressiveGlassNet: Glass Detection with Progressive Decoder Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Glass image segmentation is a branch of semantic segmentation, and it has great similarity with semantic segmentation. Decoder is the key to semantic segmentation network to … |
Qingling Chang; Xiaofei Meng; Zhiyong Hong; Yan Cui; | 2024 IEEE International Symposium on Parallel and … | 2024-10-30 |
| 820 | GS2-GNeSF: Geometry-Semantics Synergy for Generalizable Neural Semantic Fields Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing approaches to generalizable NeSF fall short in fully exploiting the geometric and semantic features as well as their mutual interactions, resulting in suboptimal performance in both novel-view image synthesis and semantic segmentation. To address this limitation, we propose Geometry-Semantics Synergy for Generalized Neural Semantic Fields (GS2-GNeSF), a novel approach aimed at improving the performance of generalizable NeSF through the comprehensive construction and synergistic interaction of geometric and semantic features. |
Chengshun Wang; Na Zhao; | mm | 2024-10-30 |
| 821 | S3PT: Scene Semantics and Structure Guided Clustering to Boost Self-Supervised Pre-Training for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose S3PT, a novel scene semantics and structure guided clustering to provide more scene-consistent objectives for self-supervised training. |
MACIEJ K. WOZNIAK et. al. | arxiv-cs.CV | 2024-10-30 |
| 822 | Anatomical Prior Guided Spatial Contrastive Learning for Few-Shot Medical Image Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an anatomical prior guided spatial contrastive learning, called APSCL, which exploits anatomical prior knowledge derived from medical images to construct contrastive learning from a spatial perspective for few-shot medical image segmentation. |
Wendong Huang; Jinwu Hu; Xiuli Bi; Bin Xiao; | mm | 2024-10-30 |
| 823 | 3D Scene De-occlusion in Neural Radiance Fields: A Framework for Obstacle Removal and Realistic Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the performance of these works has been validated only on data collected over a narrow range of views, and it degrades over a wide range of views. To address this problem, we propose a novel NeRF framework to remove the obstacle and reproduce occluded areas in high quality for both wide and narrow ranges of views. |
Yi Liu; Xinyi Li; Wenjing Shuai; | mm | 2024-10-30 |
| 824 | Generalized Source-Free Domain-adaptive Segmentation Via Reliable Knowledge Propagation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we focus on a more challenging paradigm in semantic segmentation, Generalized SFDA (G-SFDA), aiming to achieve robust performance on both source and target domains. |
QI ZANG et. al. | mm | 2024-10-30 |
| 825 | Crossmodal Few-shot 3D Point Cloud Semantic Segmentation Via View Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, previous methods use single-view point cloud generation algorithms to bridge the gap between 2D images and 3D point clouds, leaving the incomplete geometry of an object or scene due to occlusions. To address this issue, we propose a novel view synthesis cross-modal few-shot point cloud semantic segmentation network. |
Ziyu Zhao; Pingping Cai; Canyu Zhang; Xiaoguang Li; Song Wang; | mm | 2024-10-30 |
| 826 | RefMask3D: Language-Guided Transformer for 3D Referring Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose RefMask3D to explore the comprehensive multi-modal feature interaction and understanding. |
Shuting He; Henghui Ding; | mm | 2024-10-30 |
| 827 | ESNet: An Efficient Real-time Semantic Segmentation Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Efficient image segmentation algorithms are critical in computer vision, as they maintain high processing speeds while handling large amounts of data and providing practical … |
Renping Xie; Cong He; Ming Tao; Kai Ding; | 2024 IEEE International Symposium on Parallel and … | 2024-10-30 |
| 828 | Semantic Segmentation of River Video for Smart River Monitoring System Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In this work, a high-efficiency semantic segmentation method is proposed for a smart river monitoring surveillance system. The proposed model is trained by an original river … |
Haruki Inoue; Takafumi Katayama; Tian Song; T. Shimamoto; | 2024 IEEE 13th Global Conference on Consumer Electronics … | 2024-10-29 |
| 829 | Text2Seg: Zero-shot Remote Sensing Image Semantic Segmentation Via Text-Guided Visual Foundation Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
JIELU ZHANG et. al. | GeoAI@SIGSPATIAL | 2024-10-29 |
| 830 | Diffusion-driven Cycle-consistent Domain Adaptation for Cross-modality Medical Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Medical image segmentation often suffers from performance degradation when applied to images from different domains. To address this, we propose DiMA-Seg (Diffusion Model … |
Hang Su; Renshu Gu; Xiangyang Wu; M. Toyoura; Gang Xu; | 2024 International Conference on Cyberworlds (CW) | 2024-10-29 |
| 831 | LDCNet: Long-Distance Context Modeling for Large-Scale 3D Point Cloud Scene Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Shoutong Luo; Zhengxing Sun; Yi Wang; Yunhan Sun; Chendi Zhu; | ACM Multimedia | 2024-10-28 |
| 832 | Automatic Semantic Segmentation and Classification of Remote Sensing Image Data for Flood Detection Using Novel LSTM Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Amruta Sonavale; Midhun Chakkaravarthy; Surampudi Srinivasa Rao; H. B. M. Salleh; Jagannath Jadhav; | SN Comput. Sci. | 2024-10-28 |
| 833 | Semantic-Enhanced Point-Box Joint Prompting for Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Quan Zhao; Siying Wu; Yueyi Zhang; Xiaoyan Sun; | International Conference on Information Photonics | 2024-10-27 |
| 834 | MSD-Net: A Multi-Scale Semantic Segmentation Method for Images with Metal Artifact Interference Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The semantic segmentation for computed tomography (CT) images is an important step in clinical diagnosis. Metal artifacts may cause significant segmentation errors, and even after … |
Chaoyue Zhang; Fanning Kong; Yudong Hao; Qingjie Cao; Zaifeng Shi; | 2024 17th International Congress on Image and Signal … | 2024-10-26 |
| 835 | Semantic Segmentation and Scene Reconstruction of RGB-D Image Frames: An End-to-End Modular Pipeline for Robotic Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a novel end-to-end modular pipeline that integrates state-of-the-art semantic segmentation, human tracking, point-cloud fusion, and scene reconstruction. |
ZHIWU ZHENG et. al. | arxiv-cs.CV | 2024-10-23 |
| 836 | Surgical Scene Segmentation By Transformer With Asymmetric Feature Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Secondly, the specific characteristics of anatomy and instruments are not specifically modeled. To tackle the above challenges, we propose a novel Transformer-based framework with an Asymmetric Feature Enhancement module (TAFE), which enhances local information and then actively fuses the improved feature pyramid into the embeddings from transformer encoders by a multi-scale interaction attention strategy. |
Cheng Yuan; Yutong Ban; | arxiv-cs.CV | 2024-10-23 |
| 837 | Multi Kernel Estimation Based Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents a novel approach for multi-kernel estimation by enhancing the KernelGAN algorithm, which traditionally estimates a single kernel for the entire image. |
Haim Goldfisher; Asaf Yekutiel; | arxiv-cs.CV | 2024-10-22 |
| 838 | Progressive Semantic Consistency Towards Unsupervised Cross-Modality Medical Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Although deep neural networks have made great progress in the field of multimodal medical image segmentation, they are severely limited by 1) the need to use sufficient dataset … |
Mengyue Wang; Deqing Zhang; | 2024 5th International Conference on Machine Learning and … | 2024-10-18 |
| 839 | TICNet: Three-Branch Real-Time Semantic Segmentation Network with Intensive Compensation of Railway Track Scenes Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the rapid development of railway traffic system, real-time semantic segmentation plays a crucial role in railway track scene monitoring. However, most of the existing methods … |
Yiwen Bai; Lu Yang; Lei Zhang; Yajing Song; | 2024 5th International Conference on Machine Learning and … | 2024-10-18 |
| 840 | RDFormer: Efficient RGB-D Semantic Segmentation in Complex Outdoor Scenes Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: RGB-D data can now be easily acquired from vehicle-mounted sensors. This paper introduces RDFormer, an improved semantic segmentation method based on existing image segmentation … |
Zhiyong Peng; Yongjie Zheng; Yuhao Cheng; Yulong Qiao; | 2024 5th International Conference on Machine Learning and … | 2024-10-18 |
| 841 | Stroke-Seg: A Deep Learning-based Framework for Chinese Stroke Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Chinese stroke segmentation is a crucial and challenging task for various downstream applications such as font generation, aesthetic evaluation etc. Conventional semantic … |
Xinyu Gong; Zeyang Bai; Haitao Nie; Bin Xie; | IET Image Process. | 2024-10-18 |
| 842 | SemSim: Revisiting Weak-to-Strong Consistency from A Semantic Similarity Perspective for Semi-supervised Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, two key limitations still persist, impeding its efficient adaptation: (1) the neglect of contextual dependencies results in inconsistent predictions for similar semantic features, leading to incomplete object segmentation; (2) the lack of exploitation of semantic similarity between labeled and unlabeled data induces considerable class-distribution discrepancy. To address these limitations, we propose a novel semi-supervised framework based on FixMatch, named SemSim, powered by two appealing designs from semantic similarity perspective: (1) rectifying pixel-wise prediction by reasoning about the intra-image pair-wise affinity map, thus integrating contextual dependencies explicitly into the final prediction; (2) bridging labeled and unlabeled data via a feature querying mechanism for compact class representation learning, which fully considers cross-image anatomical similarities. |
SHIAO XIE et. al. | arxiv-cs.CV | 2024-10-17 |
| 843 | Railway LiDAR Semantic Segmentation Based on Intelligent Semi-automated Data Annotation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Thus, we propose an approach for a point-wise 3D semantic segmentation based on the 2DPass network architecture using scans and images jointly. |
Florian Wulff; Bernd Schaeufele; Julian Pfeifer; Ilja Radusch; | arxiv-cs.CV | 2024-10-17 |
| 844 | Adaptive Prompt Learning with SAM for Few-shot Scanning Probe Microscope Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Code and dataset used in this study will be made available upon acceptance. |
YAO SHEN et. al. | arxiv-cs.CV | 2024-10-16 |
| 845 | RClicks: Realistic Click Simulation for Benchmarking Interactive Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Using our model and dataset, we propose RClicks benchmark for a comprehensive comparison of existing interactive segmentation methods on realistic clicks. |
ANTON ANTONOV et. al. | arxiv-cs.CV | 2024-10-15 |
| 846 | Weakly Scene Segmentation Using Efficient Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Current methods for large-scale point cloud scene semantic segmentation rely on manually annotated dense point-wise labels, which are costly, labor-intensive, and prone to errors. … |
Hao Huang; Shuaihang Yuan; Congcong Wen; Yu Hao; Yi Fang; | 2024 IEEE/RSJ International Conference on Intelligent … | 2024-10-14 |
| 847 | Real-Time Semantic Segmentation in Natural Environments with SAM-assisted Sim-to-Real Domain Transfer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation plays a pivotal role in many robotic applications requiring high-level scene understanding, such as smart farming, where the precise identification of trees … |
Han Wang; R. Mascaro; M. Chli; L. Teixeira; | 2024 IEEE/RSJ International Conference on Intelligent … | 2024-10-14 |
| 848 | Multi-View Graph Neural Network for Semantic Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic image segmentation is a fundamental task in computer vision, frequently addressed using deep learning techniques. Nevertheless, these methods often struggle to fully … |
Elie Karam; N. Jrad; Patty Coupeau; Jean-Baptiste Fasquel; Fahed Abdallah; | 2024 IEEE Thirteenth International Conference on Image … | 2024-10-14 |
| 849 | LKASeg: Remote-Sensing Image Semantic Segmentation with Large Kernel Attention and Full-Scale Skip Connections Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a remote-sensing image semantic segmentation network named LKASeg, which combines Large Kernel Attention (LSKA) and Full-Scale Skip Connections (FSC). |
XUEZHI XIANG et. al. | arxiv-cs.CV | 2024-10-14 |
| 850 | Semantic Segmentation of Persons in Point Clouds from Photon Counting Lidar Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Segmentation of persons at large distances is desired for surveillance and reconnaissance in security and defense applications. Today lidar sensors can provide high-resolution … |
Maria Axelsson; | 2024 IEEE International Conference on Imaging Systems and … | 2024-10-14 |
| 851 | Semantic Segmentation of Sentinel-2 Satellite Image for Rice Growth Phase Classification Using Deep Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Sentinel-2 satellite image is one of the big data for remote sensing. This research explores the usage of U-Net model for segmentation of rice fields and CNN Model for … |
Moh. Jabir Mubarok; E. M. Yuniarno; Yogie Oktavianus Sihombing; M. Purnomo; | 2024 IEEE International Conference on Imaging Systems and … | 2024-10-14 |
| 852 | An Object-Aware Network Embedding Deep Superpixel for Semantic Segmentation of Remote Sensing Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation forms the foundation for understanding very high resolution (VHR) remote sensing images, with extensive demand and practical application value. The … |
ZIRAN YE et. al. | Remote. Sens. | 2024-10-13 |
| 853 | High-Precision Dichotomous Image Segmentation Via Probing Diffusion Capacity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To this end, we propose DiffDIS, a diffusion-driven segmentation model that taps into the potential of the pre-trained U-Net within diffusion models, specifically designed for high-resolution, fine-grained object segmentation. |
QIAN YU et. al. | arxiv-cs.CV | 2024-10-13 |
| 854 | VideoSAM: Open-World Video Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we introduce VideoSAM, an end-to-end framework designed to address these challenges by improving object tracking and segmentation consistency in dynamic environments. |
PINXUE GUO et. al. | arxiv-cs.CV | 2024-10-11 |
| 855 | Uncertainty Estimation and Out-of-Distribution Detection for LiDAR Scene Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a method to distinguish in-distribution (ID) from OOD samples and quantify both epistemic and aleatoric uncertainties using the feature space of a single deterministic model. |
Hanieh Shojaei; Qianqian Zou; Max Mehltretter; | arxiv-cs.LG | 2024-10-11 |
| 856 | Data Augmentation for Surgical Scene Segmentation with Anatomy-Aware Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce a multi-stage approach using diffusion models to generate multi-class surgical datasets with annotations. |
Danush Kumar Venkatesh; Dominik Rivoir; Micha Pfeiffer; Fiona Kolbinger; Stefanie Speidel; | arxiv-cs.CV | 2024-10-10 |
| 857 | Are We Ready for Real-Time LiDAR Semantic Segmentation in Autonomous Driving? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These characteristics hinder the real-time semantic analysis, particularly on resource-constrained hardware architectures that constitute the main computational components of numerous robotic applications. Therefore, in this paper, we investigate various 3D semantic segmentation methodologies and analyze their performance and capabilities for resource-constrained inference on embedded NVIDIA Jetson platforms. |
Samir Abou Haidar; Alexandre Chariot; Mehdi Darouich; Cyril Joly; Jean-Emmanuel Deschaud; | arxiv-cs.RO | 2024-10-10 |
| 858 | Shift and Matching Queries for Video Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a method to extend a query-based image segmentation model to video using feature shift and query matching. |
Tsubasa Mizuno; Toru Tamaki; | arxiv-cs.CV | 2024-10-10 |
| 859 | Evaluating The Impact of Point Cloud Colorization on Semantic Segmentation Accuracy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel statistical approach to evaluate the impact of inaccurate RGB information on image-based point cloud segmentation. |
Qinfeng Zhu; Jiaze Cao; Yuanzhi Cai; Lei Fan; | arxiv-cs.CV | 2024-10-09 |
| 860 | Rethinking The Evaluation of Visible and Infrared Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper proposes a Segmentation-oriented Evaluation Approach (SEA) to assess VIF methods by incorporating the semantic segmentation task and leveraging segmentation labels available in latest VIF datasets. |
Dayan Guan; Yixuan Wu; Tianzhu Liu; Alex C. Kot; Yanfeng Gu; | arxiv-cs.CV | 2024-10-09 |
| 861 | Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these models face challenges in dealing with intricate scenes, primarily due to the heterogeneity between RGB and thermal modalities. To address this gap, we present Open-RGBT, a novel open-vocabulary RGB-T semantic segmentation model. |
Meng Yu; Luojie Yang; Xunjie He; Yi Yang; Yufeng Yue; | arxiv-cs.CV | 2024-10-09 |
| 862 | Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce *Scribbles for All*, a label and training data generation algorithm for semantic segmentation trained on scribble labels. |
Wolfgang Boettcher; Lukas Hoyer; Ozan Unal; Jan Eric Lenssen; Bernt Schiele; | nips | 2024-10-07 |
| 863 | Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel method, MCLIP, to adapt the CLIP image encoder for pixel-level understanding by guiding the model on where, which is achieved using unlabeled images and masks generated from vision foundation models such as SAM and DINO. |
HEESEONG SHIN et. al. | nips | 2024-10-07 |
| 864 | Toward Real Ultra Image Segmentation: Leveraging Surrounding Context to Cultivate General Segmentation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing ultra image segmentation methods suffer from two major challenges, namely the generalization issue (i.e. they lack the stability and generality of standard segmentation models, as they are tailored to specific datasets), and the architectural issue (i.e. they are incompatible with real-world ultra image scenes, as they compromise between image size and computing resources). To tackle these issues, we revisit the classic sliding inference framework, upon which we propose a Surrounding Guided Segmentation framework (SGNet) for ultra image segmentation. |
Sai Wang; Yutian Lin; Yu Wu; Bo Du; | nips | 2024-10-07 |
| 865 | Geometric Exploitation for Indoor Panoramic Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Unlike previous works, in this paper, we propose a novel approach for semantic segmentation of panoramic images. |
Duc Cao Dinh; Seok Kim; Kyusung Cho; | nips | 2024-10-07 |
| 866 | A Unified Framework for 3D Scene Understanding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose UniSeg3D, a unified 3D segmentation framework that achieves panoptic, semantic, instance, interactive, referring, and open-vocabulary semantic segmentation tasks within a single model. |
WEI XU et. al. | nips | 2024-10-07 |
| 867 | Zero-Shot Image Segmentation Via Recursive Normalized Cut on Diffusion Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we consider a diffusion UNet encoder as a foundation vision encoder and we introduce DiffCut, an unsupervised zero-shot segmentation method that solely harnesses the output features from the final self-attention block. |
Paul Couairon; Mustafa Shukor; Jean-Emmanuel Haugeard; Matthieu Cord; Nicolas Thome; | nips | 2024-10-07 |
| 868 | Unsupervised Hierarchy-Agnostic Segmentation: Parsing Semantic Image Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a novel algebraic methodology for unsupervised image segmentation. |
Simone Rossetti; Fiora Pirri; | nips | 2024-10-07 |
| 869 | One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce VideoLISA, a video-based multimodal large language model designed to tackle the problem of language-instructed reasoning segmentation in videos. |
ZECHEN BAI et. al. | nips | 2024-10-07 |
| 870 | Relationship Prompt Learning Is Enough for Open-Vocabulary Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Prompt learning offers a direct and parameter-efficient approach, yet it falls short in guiding VLM for pixel-level visual localization. Therefore, we propose relationship prompt module (RPM), which generates relationship prompt that directs VLM to extract pixel-level semantic embeddings suitable for OVSS. |
Li Jiahao; Yanyun Qu; Yuan Xie; Yang Lu; | nips | 2024-10-07 |
| 871 | DeiSAM: Segment Anything with Deictic Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, deep learning approaches cannot reliably interpret such deictic representations due to their lack of reasoning capabilities in complex scenarios. To remedy this issue, we propose DeiSAM — a combination of large pre-trained neural networks with differentiable logic reasoners — for deictic promptable segmentation. |
HIKARU SHINDO et. al. | nips | 2024-10-07 |
| 872 | AdaptDiff: Cross-Modality Domain Adaptation Via Weak Conditional Semantic Diffusion for Retinal Vessel Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, despite its promise, deep learning has many challenges in practice due to its inability to effectively transition to unseen domains, caused by the inherent data distribution shift and the lack of manual annotations to guide domain adaptation. To tackle this problem, we present an unsupervised domain adaptation (UDA) method named AdaptDiff that enables a retinal vessel segmentation network trained on fundus photography (FP) to produce satisfactory results on unseen modalities (e.g., OCT-A) without any manual labels. |
DEWEI HU et. al. | arxiv-cs.CV | 2024-10-06 |
| 873 | Unleashing The Potential of The Diffusion Model in Few-shot Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Our initial focus lies in understanding how to facilitate interaction between the query image and the support image, resulting in the proposal of a KV fusion method within the self-attention framework. |
MUZHI ZHU et. al. | arxiv-cs.CV | 2024-10-03 |
| 874 | Annotated Dataset for Training Cloud Segmentation Neural Networks Using High-Resolution Satellite Remote Sensing Imagery Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The integration of satellite data with deep learning has revolutionized various tasks in remote sensing, including classification, object detection, and semantic segmentation. … |
Mingyuan He; Jie Zhang; Yang He; Xinjie Zuo; Zebin Gao; | Remote. Sens. | 2024-10-02 |
| 875 | MFH‐Net: A Hybrid CNN‐Transformer Network Based Multi‐Scale Fusion for Medical Image Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In recent years, U‐Net and its variants have gained widespread use in medical image segmentation. One key aspect of U‐Net’s design is the skip connection, facilitating the … |
Ying Wang; Meng Zhang; Jian’an Liang; Meiyan Liang; | International Journal of Imaging Systems and Technology | 2024-10-02 |
| 876 | Real-Time 3D Visual Perception By Cross-Dimensional Refined Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We introduce a novel learning method that can effectively perceive both the geometry structure and semantic labels of a 3D scene in real time. Existing real-time 3D scene … |
Ziyang Hong; C. Patrick Yue; | IEEE Transactions on Circuits and Systems for Video … | 2024-10-01 |
| 877 | Multi-Bottleneck Progressive Propulsion Network for Medical Image Semantic Segmentation with Integrated Macro-micro Dual-stage Feature Enhancement and Refinement IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
YUEFEI WANG et. al. | Expert Syst. Appl. | 2024-10-01 |
| 878 | Superpixel-Guided Multi-Type Rail Segmentation Via Contextual Information Aggregation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Vision-based anomaly inspection plays a crucial role in the efficient maintenance of millions of kilometers of railway, with rail segmentation, a key step in such anomaly … |
XUEFENG NI et. al. | IEEE Transactions on Intelligent Transportation Systems | 2024-10-01 |
| 879 | Semantic Segmentation of Unmanned Aerial Vehicle Remote Sensing Images Using SegFormer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper evaluates the effectiveness and efficiency of SegFormer, a semantic segmentation framework, for the semantic segmentation of UAV images. |
Vlatko Spasev; Ivica Dimitrovski; Ivan Chorbev; Ivan Kitanovski; | arxiv-cs.CV | 2024-10-01 |
| 880 | CAFA: Cross-Modal Attentive Feature Alignment for Cross-Domain Urban Scene Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Autonomous driving systems rely heavily on semantic segmentation models for accurate and safe decision-making. High segmentation performance in real-world urban scenes is crucial … |
Peng Liu; Yanqi Ge; Lixin Duan; Wen Li; Fengmao Lv; | IEEE Transactions on Industrial Informatics | 2024-10-01 |
| 881 | Deep Multimodal Fusion for Semantic Segmentation of Remote Sensing Earth Observation Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a late fusion deep learning model (LF-DLM) for semantic segmentation that leverages the complementary strengths of both VHR aerial imagery and SITS. |
Ivica Dimitrovski; Vlatko Spasev; Ivan Kitanovski; | arxiv-cs.CV | 2024-10-01 |
| 882 | Beyond Low-dimensional Features: Enhancing Semi-supervised Medical Image Semantic Segmentation with Advanced Consistency Learning Techniques IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yujie Lu; Wenting Li; Zhongwei Cui; Yongjun Zhang; | Expert Syst. Appl. | 2024-10-01 |
| 883 | I-MedSAM: Implicit Medical Image Segmentation with Segment Anything IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose I-MedSAM, which leverages the benefits of both continuous representations and SAM, to obtain better cross-domain ability and accurate boundary delineation. |
XIAOBAO WEI et. al. | eccv | 2024-09-30 |
| 884 | Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Though adversarial erasing has prevailed in weakly supervised semantic segmentation to help activate integral object regions, existing approaches still suffer from the dilemma of under-activation and over-expansion due to the difficulty in determining when to stop erasing. In this paper, we propose a Knowledge Transfer with Simulated Inter-Image Erasing (KTSE) approach for weakly supervised semantic segmentation to alleviate the above problem. |
TAO CHEN et. al. | eccv | 2024-09-30 |
| 885 | Beyond Pixels: Semi-Supervised Semantic Segmentation with A Multi-scale Patch-based Multi-Label Classifier IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we show that an effective way to incorporate contextual information is through a patch-based classifier. |
Prantik Howlader; Srijan Das; Hieu Le; Dimitris Samaras; | eccv | 2024-09-30 |
| 886 | SegPoint: Segment Any Point Cloud Via Large Language Model IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a model, called SegPoint, that leverages the reasoning capabilities of a multi-modal Large Language Model (LLM) to produce point-wise segmentation masks across a diverse range of tasks: 1) 3D instruction segmentation, 2) 3D referring segmentation, 3) 3D semantic segmentation, and 4) 3D open-vocabulary semantic segmentation. To advance 3D instruction research, we introduce a new benchmark designed to evaluate segmentation performance from complex and implicit instructional texts, featuring point cloud-instruction pairs. |
Shuting He; Henghui Ding; Xudong Jiang; Bihan Wen; | eccv | 2024-09-30 |
| 887 | Explore The Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Our study delves into the impact of CLIP’s [CLS] token on patch feature correlations, revealing a dominance of “global” patches that hinders local feature discrimination. To overcome this, we propose CLIPtrase, a novel training-free semantic segmentation strategy that enhances local feature awareness through recalibrated self-correlation among patches. |
Tong Shao; Zhuotao Tian; Hang Zhao; Jingyong Su; | eccv | 2024-09-30 |
| 888 | Open-Vocabulary Camouflaged Object Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To fill in the gaps, we introduce a new task, open-vocabulary camouflaged object segmentation (OVCOS), and construct a large-scale complex scene dataset (OVCamo) containing 11,483 hand-selected images with fine annotations and corresponding object classes. |
Youwei Pang; Xiaoqi Zhao; JiaMing Zuo; Lihe Zhang; Huchuan Lu; | eccv | 2024-09-30 |
| 889 | Dataset Enhancement with Instance-Level Augmentations IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present a method for expanding a dataset by incorporating knowledge from the wide distribution of pre-trained latent diffusion models. |
Orest Kupyn; Christian Rupprecht; | eccv | 2024-09-30 |
| 890 | From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a hierarchical transformer-based model designed for sophisticated image segmentation tasks, effectively bridging the granularity of part segmentation with the comprehensive scope of object segmentation. |
Yunfei Xie; Cihang Xie; Alan Yuille; Jieru Mei; | eccv | 2024-09-30 |
| 891 | SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To adapt the VLM from global to local reasoning, we introduce a spatial fine-tuning strategy for label-efficient learning. |
Lukas Hoyer; David Joseph Tan; Muhammad Ferjad Naeem; Luc Van Gool; Federico Tombari; | eccv | 2024-09-30 |
| 892 | VISA: Reasoning Video Object Segmentation Via Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a new task, Reasoning Video Object Segmentation (ReasonVOS). |
CILIN YAN et. al. | eccv | 2024-09-30 |
| 893 | View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we address the challenging task of lifting multi-granular and view-inconsistent image segmentations into a hierarchical and 3D-consistent representation. |
Haodi He; Colton Stearns; Adam Harley; Leonidas Guibas; | eccv | 2024-09-30 |
| 894 | Boosting Gaze Object Prediction Via Pixel-level Supervision from Vision Foundation Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents a more challenging gaze object segmentation (GOS) task, which involves inferring the pixel-level mask corresponding to the object captured by human gaze behavior. |
Yang Jin; Lei Zhang; Shi Yan; Bin Fan; Binglu Wang; | eccv | 2024-09-30 |
| 895 | Open Panoramic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To further enhance the distortion-aware modeling ability from the pinhole source domain, we propose a novel data augmentation method called Random Equirectangular Projection (RERP) which is specifically designed to address object deformations in advance. |
JUNWEI ZHENG et. al. | eccv | 2024-09-30 |
| 896 | Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation Without Manual Labels IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In contrast, recent 2D foundation models have demonstrated strong generalization and impressive zero-shot abilities, inspiring us to incorporate these characteristics from 2D models into 3D models. Therefore, we explore the use of image segmentation foundation models to automatically generate high-quality training labels for 3D segmentation models. |
RUI HUANG et. al. | eccv | 2024-09-30 |
| 897 | Ensemble of Semantic Segmentation Models for Oral Epithelial Dysplasia Images IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Early detection of potentially malignant disorders such as oral epithelial dysplasia (OED) is important for preventing oral cancer. Semantic segmentation of nuclei in … |
A. BARBOSA-SILVA et. al. | 2024 37th SIBGRAPI Conference on Graphics, Patterns and … | 2024-09-30 |
| 898 | Open-Vocabulary RGB-Thermal Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Second, when fusing RGB and thermal images, they often need to design complex fusion network structures, which usually results in low network training efficiency. We present OpenRSS, the Open-vocabulary RGB-T Semantic Segmentation method, to solve these two disadvantages. |
GUOQIANG ZHAO et. al. | eccv | 2024-09-30 |
| 899 | SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present SegGen, a new data generation approach that pushes the performance boundaries of state-of-the-art image segmentation models. |
HANRONG YE et. al. | eccv | 2024-09-30 |
| 900 | 3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose 3DSS-VLG, a weakly supervised approach for 3D Semantic Segmentation with 2D Vision-Language Guidance, an alternative approach that a 3D model predicts dense-embedding for each point which is co-embedded with both the aligned image and text spaces from the 2D vision-language model. |
XIAOXU XU et. al. | eccv | 2024-09-30 |
| 901 | Segment and Recognize Anything at Any Granularity IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce an augmented image segmentation foundation for segmenting and recognizing anything at desired granularities. |
FENG LI et. al. | eccv | 2024-09-30 |
| 902 | Class-Agnostic Visio-Temporal Scene Sketch Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a Class-Agnostic Visio-Temporal Network (CAVT) for scene sketch semantic segmentation. |
Aleyna Kütük; Tevfik Metin Sezgin; | arxiv-cs.CV | 2024-09-30 |
| 903 | RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To effectively embed high-dimensional features, we propose a double-nested autoencoder structure with a novel class-aware embedding objective to encode high-dimensional features into manageable voxel-wise embeddings. |
Li Li; Hubert P. H. Shum; Toby P Breckon; | eccv | 2024-09-30 |
| 904 | Occlusion-Aware Seamless Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Panoramic images can broaden the Field of View (FoV), occlusion-aware prediction can deepen the understanding of the scene, and domain adaptation can transfer across viewing domains. In this work, we introduce a novel task, Occlusion-Aware Seamless Segmentation (OASS), which simultaneously tackles all these three challenges. |
YIHONG CAO et. al. | eccv | 2024-09-30 |
| 905 | OLAF: A Plug-and-Play Framework for Enhanced Multi-object Multi-part Scene Parsing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the task, we propose a plug-and-play approach termed OLAF. |
Pranav Gupta; Rishubh Singh; Pradeep Shenoy; Ravi Kiran Sarvadevabhatla; | eccv | 2024-09-30 |
| 906 | O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For the purpose of preserving consistency in 3D object properties across different viewpoints, we propose a spatial adaptive voxel adjustment mechanism and a multi-view weight selection method. |
MUER TIE et. al. | eccv | 2024-09-30 |
| 907 | Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose several problem-specific novel attacks minimizing different metrics in accuracy and mIoU. |
Francesco Croce; Naman D. Singh; Matthias Hein; | eccv | 2024-09-30 |
| 908 | Placing Objects in Context Via Inpainting for Out-of-distribution Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose the Placing Objects in Context (POC) pipeline to realistically add any object into any image via diffusion models. |
Pau de Jorge Aranda; Riccardo Volpi; Puneet Dokania; Philip Torr; Gregory Rogez; | eccv | 2024-09-30 |
| 909 | Can Textual Semantics Mitigate Sounding Object Segmentation Preference? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Motivated by the fact that text modality is well explored and contains rich abstract semantics, we propose leveraging text cues from the visual scene to enhance audio guidance with the semantics inherent in text. |
Yaoting Wang; Peiwen Sun; Yuanchao Li; Honggang Zhang; Di Hu; | eccv | 2024-09-30 |
| 910 | In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present Lazy Visual Grounding for open-vocabulary semantic segmentation, which decouples unsupervised object mask discovery from object grounding. |
Dahyun Kang; Minsu Cho; | eccv | 2024-09-30 |
| 911 | Enriching Information and Preserving Semantic Congruence in Expanding Curvilinear Object Segmentation Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Curvilinear object segmentation plays a crucial role across various applications, yet datasets in this domain often suffer from small scale due to the high costs associated with data acquisition and annotation. To address these challenges, this paper introduces a novel approach for expanding curvilinear object segmentation datasets, focusing on enhancing the informativeness of generated data and the consistency between semantic maps and generated images. |
Qin Lei; Jiang Zhong; Qizhu Dai; | eccv | 2024-09-30 |
| 912 | One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce VideoLISA, a video-based multimodal large language model designed to tackle the problem of language-instructed reasoning segmentation in videos. |
ZECHEN BAI et. al. | arxiv-cs.CV | 2024-09-29 |
| 913 | Gaussian Heritage: 3D Digitization of Cultural Heritage with Integrated Object Segmentation IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The creation of digital replicas of physical objects has valuable applications for the preservation and dissemination of tangible cultural heritage. However, existing methods are … |
Mahtab Dahaghin; Myrna Castillo; Kourosh Riahidehkordi; M. Toso; A. D. Bue; | ArXiv | 2024-09-27 |
| 914 | Get It For Free: Radar Segmentation Without Expert Labels and Its Application in Odometry and Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a novel weakly supervised semantic segmentation method for radar segmentation, where the existing LiDAR semantic segmentation models are employed to generate semantic labels, which then serve as supervision signals for training a radar semantic segmentation model. |
Siru Li; Ziyang Hong; Yushuai Chen; Liang Hu; Jiahu Qin; | arxiv-cs.RO | 2024-09-26 |
| 915 | Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Go-SLAM, a novel framework that utilizes 3D Gaussian Splatting SLAM to reconstruct dynamic environments while embedding object-level information within the scene representations. |
Phu Pham; Dipam Patel; Damon Conover; Aniket Bera; | arxiv-cs.RO | 2024-09-25 |
| 916 | Global-Local Medical SAM Adaptor Based on Full Adaption Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, Med-SA still can be improved, as it fine-tunes SAM in a partial adaption manner. To resolve this problem, we present a novel global medical SAM adaptor (GMed-SA) with full adaption, which can adapt SAM globally. |
MENG WANG et. al. | arxiv-cs.AI | 2024-09-25 |
| 917 | PETRFusion: Multi-Sensor Fusion Based BEV Semantic Segmentation Network for Autonomous Driving Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Bird’s Eye View (BEV) based perception algorithms are receiving increasing attention in the field of autonomous driving, how to more effectively utilize multi-sensor data in BEV … |
YANG ZHAO et. al. | 2024 IEEE 27th International Conference on Intelligent … | 2024-09-24 |
| 918 | Potential Field As Scene Affordance for Behavior Change-Based Visual Risk Object Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we compute potential fields by assigning different energy levels according to the semantic labels obtained from BEV semantic segmentation. |
Pang-Yuan Pao; Shu-Wei Lu; Ze-Yan Lu; Yi-Ting Chen; | arxiv-cs.CV | 2024-09-24 |
| 919 | Potential Fields As Scene Affordance for Behavior Change-Based Visual Risk Object Identification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We study behavior change-based visual risk object identification (Visual-ROI), a critical framework designed to detect potential hazards for intelligent driving systems. Existing … |
Pang-Yuan Pao; Shu-Wei Lu; Ze-Yan Lu; Yi-Ting Chen; | 2025 IEEE International Conference on Robotics and … | 2024-09-24 |
| 920 | The BRAVO Semantic Segmentation Challenge Results in UNCV2024 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose the unified BRAVO challenge to benchmark the reliability of semantic segmentation models under realistic perturbations and unknown out-of-distribution (OOD) scenarios. |
TUAN-HUNG VU et. al. | arxiv-cs.CV | 2024-09-23 |
| 921 | Diffusion-based RGB-D Semantic Segmentation with Deformable Attention Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a diffusion-based framework to address the RGB-D semantic segmentation problem. Additionally, we demonstrate that utilizing a Deformable Attention Transformer as the encoder to extract features from depth images effectively captures the characteristics of invalid regions in depth measurements. |
Minh Bui; Kostas Alexis; | arxiv-cs.CV | 2024-09-23 |
| 922 | ZeroSCD: Zero-Shot Street Scene Change Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditional change detection methods rely on training models that take these image pairs as input and estimate the changes, which requires large amounts of annotated data, a costly and time-consuming process. To overcome this, we propose ZeroSCD, a zero-shot scene change detection framework that eliminates the need for training. |
Shyam Sundar Kannan; Byung-Cheol Min; | arxiv-cs.RO | 2024-09-23 |
| 923 | Infield Disease Detection in Citrus Plants: Integrating Semantic Segmentation and Dynamic Deep Learning Object Detection Model for Enhanced Agricultural Yield Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
N. Rani; Arun Sri Krishna; M. Sunag; M. A. Sangamesha; B. R. Pushpa; | Neural Comput. Appl. | 2024-09-21 |
| 924 | MOSE: Monocular Semantic Reconstruction Using NeRF-Lifted Noisy Priors Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose MOSE, a neural field semantic reconstruction approach to lift inferred image-level noisy priors to 3D, producing accurate semantics and geometry in both 3D and 2D space. |
Zhenhua Du; Binbin Xu; Haoyu Zhang; Kai Huo; Shuaifeng Zhi; | arxiv-cs.CV | 2024-09-21 |
| 925 | CUS3D: CLIP-based Unsupervised 3D Segmentation Via Object-level Denoise Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, unlike previous research that ignores the “noise” raised during feature projection from 2D to 3D, we propose a novel distillation learning framework named CUS3D. |
Fuyang Yu; Runze Tian; Zhen Wang; Xiaochuan Wang; Xiaohui Liang; | arxiv-cs.CV | 2024-09-20 |
| 926 | A Bottom-Up Approach to Class-Agnostic Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present a novel bottom-up formulation for addressing the class-agnostic segmentation problem. |
Sebastian Dille; Ari Blondal; Sylvain Paris; Yağız Aksoy; | arxiv-cs.CV | 2024-09-20 |
| 927 | Learning Scene Semantics From Vehicle-Centric Data for City-Scale Digital Twins Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: The creation of digital twins of cityscapes requires understanding the semantics of relevant objects encountered in the scene, with classes possibly not well covered in … |
HERMANN FÜRNTRATT et. al. | 2024 International Conference on Content-Based Multimedia … | 2024-09-18 |
| 928 | HS3-Bench: A Benchmark and Strong Baseline for Hyperspectral Semantic Segmentation in Driving Scenarios IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Even though some datasets exist, there is no standard benchmark available to systematically measure progress on this task and evaluate the benefit of hyperspectral data. In this paper, we work towards closing this gap by providing the HyperSpectral Semantic Segmentation benchmark (HS3-Bench). |
Nick Theisen; Robin Bartsch; Dietrich Paulus; Peer Neubert; | arxiv-cs.CV | 2024-09-17 |
| 929 | Fuse4Seg: Image-Level Fusion Based Multi-Modality Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We argue the current feature-level fusion strategy is prone to semantic inconsistencies and misalignments across various imaging modalities because it merges features at intermediate layers in a neural network without evaluative control. To mitigate this, we introduce a novel image-level fusion based multi-modality medical image segmentation method, Fuse4Seg, which is a bi-level learning framework designed to model the intertwined dependencies between medical image segmentation and medical image fusion. |
Yuchen Guo; Weifeng Su; | arxiv-cs.CV | 2024-09-16 |
| 930 | Semantic2D: A Semantic Dataset for 2D Lidar Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a 2D lidar semantic segmentation dataset to enhance the semantic scene understanding for mobile robots in different indoor robotics applications. |
Zhanteng Xie; Philip Dames; | arxiv-cs.RO | 2024-09-15 |
| 931 | Weakly Supervised Point Cloud Semantic Segmentation Based on Scene Consistency Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Yingchun Niu; Jianqin Yin; Chao Qi; Liang Geng; | Appl. Intell. | 2024-09-14 |
| 932 | Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a method for interpretable semantic segmentation that leverages multi-scale image representation for prototypical part learning. |
Hugo Porta; Emanuele Dalsasso; Diego Marcos; Devis Tuia; | arxiv-cs.CV | 2024-09-14 |
| 933 | A Edge-Guided Satellite Image Semantic Segmentation Method for Real Estate Appraisal Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: This paper proposes an edge-guided satellite image semantic segmentation method for real estate appraisal. The digitization of real estate appraisal has been significantly … |
Yinuo Cui; Yilin He; Fangyuan Zhu; | 2024 3rd International Conference on Artificial … | 2024-09-13 |
| 934 | Lightweight Semantic Segmentation Network for Remote Sensing Urban Scenes Based on Selective Kernel Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: In the context of remote sensing images, semantic segmentation networks must possess robust global information extraction capabilities due to the presence of indistinct object … |
Youwen Fan; | 2024 3rd International Conference on Artificial … | 2024-09-13 |
| 935 | AFFSegNet: Adaptive Feature Fusion Segmentation Network for Microtumors and Multi-Organ Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce an augmented multi-layer perceptron within the encoder to explicitly model long-range dependencies during feature extraction. |
FUCHEN ZHENG et. al. | arxiv-cs.CV | 2024-09-12 |
| 936 | UNIT: Unsupervised Online Instance Segmentation Through Time Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To that end, we leverage an instance segmentation backbone and propose a new training recipe that enables the online tracking of objects. |
Corentin Sautier; Gilles Puy; Alexandre Boulch; Renaud Marlet; Vincent Lepetit; | arxiv-cs.CV | 2024-09-12 |
| 937 | Segmentation By Factorization: Unsupervised Semantic Segmentation for Pathology By Factorizing Foundation Model Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Segmentation by Factorization (F-SEG), an unsupervised segmentation method for pathology that generates segmentation masks from pre-trained deep learning models. |
Jacob Gildenblat; Ofir Hadar; | arxiv-cs.CV | 2024-09-09 |
| 938 | SGSeg: Enabling Text-free Inference in Language-guided Segmentation of Chest X-rays Via Self-guidance IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this study, we propose a self-guided segmentation framework (SGSeg) that leverages language guidance for training (multi-modal) while enabling text-free inference (uni-modal), which is the first that enables text-free inference in language-guided segmentation. |
Shuchang Ye; Mingyuan Meng; Mingjian Li; Dagan Feng; Jinman Kim; | arxiv-cs.CV | 2024-09-07 |
| 939 | ISeg: An Iterative Refinement-based Framework for Training-free Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To fully utilize self-attention map, we present a deep experimental analysis on iteratively refining cross-attention map with self-attention map, and propose an effective iterative refinement framework for training-free segmentation, named iSeg. |
Lin Sun; Jiale Cao; Jin Xie; Fahad Shahbaz Khan; Yanwei Pang; | arxiv-cs.CV | 2024-09-04 |
| 940 | Evaluation Study on SAM 2 for Class-agnostic Instance-level Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Segment Anything Model (SAM) has demonstrated powerful zero-shot segmentation performance in natural scenes. |
Jialun Pei; Zhangjun Zhou; Tiantian Zhang; | arxiv-cs.CV | 2024-09-04 |
| 941 | AllWeatherNet: Unified Image Enhancement for Autonomous Driving Under Adverse Weather and Lowlight-conditions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Existing methods have limited effectiveness in improving essential computer vision tasks, such as semantic segmentation, and often focus on only one specific condition, such as removing rain or translating nighttime images into daytime ones. To address these limitations, we propose a method to improve the visual quality and clarity degraded by such adverse conditions. |
CHENGHAO QIAN et. al. | arxiv-cs.CV | 2024-09-03 |
| 942 | Segmenting Object Affordances: Reproducibility and Sensitivity to Scale Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, experimental setups are often not reproducible, thus leading to unfair and inconsistent comparisons. In this work, we benchmark these methods under a reproducible setup on two single objects scenarios, tabletop without occlusions and hand-held containers, to facilitate future comparisons. |
Tommaso Apicella; Alessio Xompero; Paolo Gastaldo; Andrea Cavallaro; | arxiv-cs.CV | 2024-09-03 |
| 943 | Fast Semantic Segmentation of Ultra-High-Resolution Remote Sensing Images Via Score Map and Fast Transformer-Based Fusion Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: For ultra-high-resolution (UHR) image semantic segmentation, striking a balance between computational efficiency and storage space is a crucial research direction. This paper … |
Yihao Sun; Mingrui Wang; Xiaoyi Huang; Chengshu Xin; Yinan Sun; | Remote. Sens. | 2024-09-02 |
| 944 | Transferring Multi-Modal Domain Knowledge to Uni-Modal Domain for Urban Scene Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Synthetic data (i.e., source domain) have been widely adopted to improve the semantic segmentation performance for real-world images (i.e., target domain), since obtaining … |
PENG LIU et. al. | IEEE Transactions on Intelligent Transportation Systems | 2024-09-01 |
| 945 | Multiresolution Refinement Network for Semantic Segmentation in Internet of Things Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: With the large-scale deployment of the Internet of Things (IoT), the demand for real-time perception and environment understanding in road scenarios is becoming increasingly … |
Dakai Wang; Xiangyang Jiang; Shilong Li; Jianxin Ma; Miaohui Zhang; | IEEE Internet of Things Journal | 2024-09-01 |
| 946 | Multi-source Domain Adaptation for Panoramic Semantic Segmentation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, these methods struggle to understand the panoramic structure using only real pinhole images and lack real-world scene perception with only synthetic panoramic images. Therefore, in this paper, we propose a new task, Multi-source Domain Adaptation for Panoramic Semantic Segmentation (MSDA4PASS), which leverages both real pinhole and synthetic panoramic images to improve segmentation on unlabeled real panoramic images. |
JING JIANG et. al. | arxiv-cs.CV | 2024-08-29 |
| 947 | DQFormer: Towards Unified LiDAR Panoptic Segmentation with Decoupled Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose decoupling things/stuff queries according to their intrinsic properties for individual decoding and disentangling classification/segmentation to mitigate ambiguity. |
YU YANG et. al. | arxiv-cs.CV | 2024-08-28 |
| 948 | SPNet: Dual-Branch Network with Spatial Supplementary Information for Building and Water Segmentation of Remote Sensing Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation is primarily employed to generate accurate prediction labels for each pixel of the input image, and then classify the images according to the generated … |
WENYU ZHAO et. al. | Remote. Sens. | 2024-08-27 |
| 949 | MROVSeg: Breaking The Resolution Curse of Vision-Language Models in Open-Vocabulary Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A typical solution is to employ additional image backbones for high-resolution inputs, but it also introduces significant computation overhead. Therefore, we propose MROVSeg, a multi-resolution training framework for open-vocabulary image segmentation with a single pretrained CLIP backbone, that uses sliding windows to slice the high-resolution input into uniform patches, each matching the input size of the well-trained image encoder. |
YUANBING ZHU et. al. | arxiv-cs.CV | 2024-08-27 |
| 950 | CLIP-SP: Vision-language Model with Adaptive Prompting for Scene Parsing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We present a novel framework, CLIP-SP, and a novel adaptive prompt method to leverage pre-trained knowledge from CLIP for scene parsing. Our approach addresses the limitations of … |
JIAAO LI et. al. | Comput. Vis. Media | 2024-08-27 |
| 951 | Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: While model development and validation are primarily conducted on idealistic scenes, geometric domain shifts, such as occlusions of the situs, are common in real-world open surgeries. To close this gap, we (1) present the first analysis of state-of-the-art (SOA) semantic segmentation models when faced with geometric out-of-distribution (OOD) data, and (2) propose an augmentation technique called Organ Transplantation, to enhance generalizability. |
SILVIA SEIDLITZ et. al. | arxiv-cs.CV | 2024-08-27 |
| 952 | Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: Robust semantic segmentation of intraoperative image data holds promise for enabling automatic surgical scene understanding and autonomous robotic surgery. While model development … |
Boqi Chen; Kevin Thandiackal; Pushpak Pati; O. Goksel; | ArXiv | 2024-08-27 |
| 953 | ICFRNet: Image Complexity Prior Guided Feature Refinement for Real-time Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we leverage image complexity as a prior for refining segmentation features to achieve accurate real-time semantic segmentation. |
Xin Zhang; Teodor Boyadzhiev; Jinglei Shi; Jufeng Yang; | arxiv-cs.CV | 2024-08-25 |
| 954 | Accuracy Improvement of Cell Image Segmentation Using Feedback Former Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This tendency leads to a lack of detailed information for segmentation. Therefore, to supplement or reinforce the missing detailed information, we hypothesized that feedback processing in the human visual cortex should be effective. |
Hinako Mitsuoka; Kazuhiro Hotta; | arxiv-cs.CV | 2024-08-23 |
| 955 | Image Segmentation in Foundation Model Era: A Survey IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We investigate two basic lines of research — generic image segmentation (i.e., semantic segmentation, instance segmentation, panoptic segmentation), and promptable image segmentation (i.e., interactive segmentation, referring segmentation, few-shot segmentation) — by delineating their respective task settings, background concepts, and key challenges. |
TIANFEI ZHOU et. al. | arxiv-cs.CV | 2024-08-23 |
| 956 | SIn-NeRF2NeRF: Editing 3D Scenes with Instructions Through Segmentation and Inpainting Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Abstract: TL;DR Perform 3D object editing selectively by disentangling it from the background scene. Instruct-NeRF2NeRF (in2n) is a promising method that enables editing of 3D scenes … |
Jiseung Hong; Changmin Lee; Gyusang Yu; | ArXiv | 2024-08-23 |
| 957 | The 2nd Solution for LSVOS Challenge RVOS Track: Spatial-temporal Refinement for Consistent Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: During testing, while these models can effectively process information over short time steps, they struggle to maintain consistent perception over prolonged time sequences, leading to inconsistencies in the resulting semantic segmentation masks. To address this challenge, we take a step further in this work by leveraging the tracking capabilities of the newly introduced Segment Anything Model version 2 (SAM-v2) to enhance the temporal consistency of the referring object segmentation model. |
Tuyen Tran; | arxiv-cs.CV | 2024-08-22 |
| 958 | Improved Semi-Supervised Attention GAN for Semantic Segmentation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation is one of the cornerstone problems in computer vision that involves assigning each image pixel to a specific semantic class. Traditional supervised learning … |
Nusrat Jahan; Thangarajah Akilan; Thanh Minh Nguyen; | 2024 IEEE Pacific Rim Conference on Communications, … | 2024-08-21 |
| 959 | Rethinking Video Segmentation with Masked Video Consistency: Did The Model Learn As Intended? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This leads to inconsistent segmentation results across frames. To address these issues, we propose a training strategy Masked Video Consistency, which enhances spatial and temporal feature aggregation. |
Chen Liang; Qiang Guo; Xiaochao Qu; Luoqi Liu; Ting Liu; | arxiv-cs.CV | 2024-08-20 |
| 960 | 3D-Aware Instance Segmentation and Tracking in Egocentric Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Egocentric videos present unique challenges for 3D scene understanding due to rapid camera motion, frequent object occlusions, and limited object visibility. This paper introduces a novel approach to instance segmentation and tracking in first-person video that leverages 3D awareness to overcome these obstacles. |
YASH BHALGAT et. al. | arxiv-cs.CV | 2024-08-19 |
| 961 | OVOSE: Open-Vocabulary Semantic Segmentation in Event-Based Cameras Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce OVOSE, the first Open-Vocabulary Semantic Segmentation algorithm for Event cameras. |
Muhammad Rameez Ur Rahman; Jhony H. Giraldo; Indro Spinelli; Stéphane Lathuilière; Fabio Galasso; | arxiv-cs.CV | 2024-08-18 |
| 962 | Depth-guided Texture Diffusion for Image Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a Depth-guided Texture Diffusion approach that effectively tackles the outlined challenge. |
Wei Sun; Yuan Li; Qixiang Ye; Jianbin Jiao; Yanzhao Zhou; | arxiv-cs.CV | 2024-08-17 |
| 963 | Tuning A SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, a Multi-Cognitive SAM-Based Instance Segmentation Model (MC-SAM SEG) is introduced to employ SAM on remote sensing domain. |
Linghao Zheng; Xinyang Pu; Feng Xu; | arxiv-cs.CV | 2024-08-16 |
| 964 | InstSynth: Instance-wise Prompt-guided Style Masked Conditional Data Synthesis for Scene Understanding Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Scene understanding at the instance level is an essential task in computer vision to support modern Advanced Driver Assistance Systems. Solutions have been proposed with abundant … |
THANH-DANH NGUYEN et. al. | 2024 International Conference on Multimedia Analysis and … | 2024-08-15 |
| 965 | HEFANet: Hierarchical Efficient Fusion and Aggregation Segmentation Network for Enhanced Rgb-thermal Urban Scene Parsing Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
ZHENGWEN SHEN et. al. | Appl. Intell. | 2024-08-14 |
| 966 | Enhancing Autonomous Vehicle Perception in Adverse Weather Through Image Augmentation During Semantic Segmentation Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We trained encoder-decoder UNet models to perform semantic segmentation. |
Ethan Kou; Noah Curran; | arxiv-cs.CV | 2024-08-13 |
| 967 | Domain-adapted Polyp Image Semantic Segmentation Utilizing A Generative Adversarial Network and U-Net Framework Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Medical image segmentation is a crucial area in medical image processing, aiming to help doctors accurately identify and segment different structures and tissues in images, … |
Shizhao Ma; Yunhai Gao; Shuquan Feng; Lin Li; Mengyuan Ma; | Proceedings of the 2024 5th International Symposium on … | 2024-08-13 |
| 968 | MacFormer: Semantic Segmentation with Fine Object Boundaries Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While Vision Transformer-based models have made significant progress, current semantic segmentation methods often struggle with precise predictions in localized areas like object boundaries. To tackle this challenge, we introduce a new semantic segmentation architecture, “MacFormer”, which features two key components. |
GUOAN XU et. al. | arxiv-cs.CV | 2024-08-11 |
| 969 | TOSS: Real-time Tracking and Moving Object Segmentation for Static Scene Mapping Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an integrated real-time framework that combines online tracking-based moving object segmentation with static map building. |
SEOYEON JANG et. al. | arxiv-cs.RO | 2024-08-10 |
| 970 | MEDANet: More Efficient Dual Attention Network for Scene Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save |
Ouyang Pan; Xiaoguo Yao; Zhijian Huang; | J. Circuits Syst. Comput. | 2024-08-09 |
| 971 | A Multiscale Feature Fusion‐guided Lightweight Semantic Segmentation Network Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Semantic segmentation, a task of assigning class labels to each pixel in an image, has found applications in various real‐world scenarios, including autonomous driving and scene … |
Xin Ye; Junchen Pan; Jichen Chen; Jingbo Zhang; | Journal of Field Robotics | 2024-08-08 |
| 972 | Embodied Uncertainty-Aware Object Segmentation Highlight: To deal with uncertainty in robot perception, we propose a method for generating a hypothesis distribution of object segmentation. |
Xiaolin Fang; Leslie Pack Kaelbling; Tomás Lozano-Pérez; | arxiv-cs.RO | 2024-08-08 |
| 973 | SegXAL: Explainable Active Learning for Semantic Segmentation in Driving Scene Scenarios Highlight: However, there are certain challenges that hinder the deployment of AI models in-the-wild scenarios, i.e., inefficient use of unlabeled data, lack of incorporation of human expertise, and lack of interpretation of the results. To mitigate these challenges, we propose a novel Explainable Active Learning (XAL) model, XAL-based semantic segmentation model SegXAL, that can (i) effectively utilize the unlabeled data, (ii) facilitate the Human-in-the-loop paradigm, and (iii) augment the model decisions in an interpretable way. |
Sriram Mandalika; Athira Nambiar; | arxiv-cs.CV | 2024-08-08 |
| 974 | Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussians Abstract: This paper introduces a novel method for open-vocabulary 3D scene querying in autonomous driving by combining Language Embedded 3D Gaussians with Large Language Models (LLMs). We … |
Amirhosein Chahe; Lifeng Zhou; | 2025 IEEE/CVF Winter Conference on Applications of Computer … | 2024-08-07 |
| 975 | Biomedical SAM 2: Segment Anything in Biomedical Images and Videos IF:3 Highlight: To explore the performance of SAM-2 in biomedical applications, we designed three evaluation pipelines for single-frame 2D image segmentation, multi-frame 3D image segmentation and multi-frame video segmentation with varied prompt designs, revealing SAM-2’s limitations in medical contexts. |
ZHILING YAN et. al. | arxiv-cs.CV | 2024-08-06 |
| 976 | Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussian Highlight: This paper introduces a novel method for open-vocabulary 3D scene querying in autonomous driving by combining Language Embedded 3D Gaussians with Large Language Models (LLMs). |
Amirhosein Chahe; Lifeng Zhou; | arxiv-cs.CV | 2024-08-06 |
| 977 | Improving Pavement Crack Segmentation Using Attention Mechanism and Self-gated Activation Abstract: Image segmentation is crucial in various applications, from autonomous driving, agriculture, and manufacturing to medical imaging and satellite imaging. It helps a computer … |
Nusrat Jahan; T. Akilan; Tharrengini Suresh; | 2024 IEEE Canadian Conference on Electrical and Computer … | 2024-08-06 |
| 978 | Pixel-Level Domain Adaptation: A New Perspective for Enhancing Weakly Supervised Semantic Segmentation IF:3 Highlight: In this paper, we argue that the distribution discrepancy between the discriminative and the non-discriminative parts of objects prevents the model from producing complete and precise pseudo masks as ground truths. |
Ye Du; Zehua Fu; Qingjie Liu; | arxiv-cs.CV | 2024-08-04 |
| 979 | PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation IF:3 Highlight: Traditional segmentation algorithms falter as they cannot accurately mimic the complexity of UAV perspectives, and the cost of obtaining multi-perspective labeled datasets is prohibitive. To address these issues, we introduce the PPTFormer, a novel Pseudo Multi-Perspective Transformer network that revolutionizes UAV image segmentation. |
Deyi Ji; Wenwei Jin; Hongtao Lu; Feng Zhao; | ijcai | 2024-08-03 |
| 980 | Aggregation and Purification: Dual Enhancement Network for Point Cloud Few-shot Segmentation Highlight: In this work, we design a novel Dual Enhancement Network (DENet) to comprehensively tackle different kinds of scene discrepancies in a coherent and synergistic framework. |
GUOXIN XIONG et. al. | ijcai | 2024-08-03 |
| 981 | Remote Sensing Image Semantic Segmentation Based on Cascaded Transformer Abstract: High-resolution (HR) remote sensing image semantic segmentation plays an important role in Earth’s surface. Despite remote sensing image semantic segmentation methods have … |
Falin Wang; Jian Ji; Yuan Wang; | IEEE Transactions on Artificial Intelligence | 2024-08-01 |
| 982 | Prompt Learning for Light Field Semantic Segmentation in The Consumer-Centric Internet of Intelligent Computing Things Abstract: Light field semantic segmentation accurately identifies the semantic information of the scene, providing solutions for various intelligent computing tasks in consumer electronics … |
CHEN JIA et. al. | IEEE Transactions on Consumer Electronics | 2024-08-01 |
| 983 | Attention Mechanism and Out-of-Distribution Data on Cross Language Image Matching for Weakly Supervised Semantic Segmentation |
Chi-Chia Sun; Jing-Ming Guo; Chen-Hung Chung; Bo-Yu Chen; | IEEE Transactions on Cognitive and Developmental Systems | 2024-08-01 |
| 984 | Multi-unit Stacked Architecture: An Urban Scene Segmentation Network Based on UNet and ShuffleNetv2 |
Dian Liu; Jianchao Du; Chuhan Li; Chenglong Yu; Mingjin Zhang; | Appl. Soft Comput. | 2024-08-01 |
| 985 | Point-supervised Brain Tumor Segmentation with Box-prompted MedSAM Highlight: Although recent vision foundational models, such as the medical segment anything model (MedSAM), have made significant advancements in bounding-box-prompted segmentation, it is not straightforward to utilize point annotation, and is prone to semantic ambiguity. In this preliminary study, we introduce an iterative framework to facilitate semantic-aware point-supervised MedSAM. |
Xiaofeng Liu; Jonghye Woo; Chao Ma; Jinsong Ouyang; Georges El Fakhri; | arxiv-cs.CV | 2024-08-01 |
| 986 | Efficient Dual-Stream Fusion Network for Real-Time Railway Scene Understanding IF:3 Abstract: Railway scene understanding is key to autonomous train operation and important in active train perception. However, most railway scene understanding methods focus on track … |
ZHIWEI CAO et. al. | IEEE Transactions on Intelligent Transportation Systems | 2024-08-01 |
| 987 | MaskUno: Switch-Split Block For Enhancing Instance Segmentation Highlight: In all the proposed variations to date, the problem of competing kernels (each class aims to maximize its own accuracy) persists when models try to synchronously learn numerous classes. In this paper, we propose mitigating this problem by replacing mask prediction with a Switch-Split block that processes refined ROIs, classifies them, and assigns them to specialized mask predictors. |
Jawad Haidar; Marc Mouawad; Imad Elhajj; Daniel Asmar; | arxiv-cs.CV | 2024-07-31 |
| 988 | Leveraging Adaptive Implicit Representation Mapping for Ultra High-Resolution Image Segmentation Highlight: Secondly, SIRMF is shared across all samples, which limits its ability to generalize and handle diverse inputs. To address these limitations, we propose a novel approach that leverages the newly proposed Adaptive Implicit Representation Mapping (AIRM) for ultra-high-resolution Image Segmentation. |
Ziyu Zhao; Xiaoguang Li; Pingping Cai; Canyu Zhang; Song Wang; | arxiv-cs.CV | 2024-07-30 |
| 989 | Learning Ordinality in Semantic Segmentation Highlight: While existing deep learning approaches achieve high accuracy, they often overlook the ordinal relationships between classes, which can provide critical domain knowledge (e.g., the pupil lies within the iris, and lane markings are part of the road). This paper introduces novel methods for spatial ordinal segmentation that explicitly incorporate these inter-class dependencies. |
Ricardo P. M. Cruz; Rafael Cristino; Jaime S. Cardoso; | arxiv-cs.CV | 2024-07-30 |
| 990 | Fine-grained Metrics for Point Cloud Semantic Segmentation Highlight: Because of this, the majority of categories and large objects are favored in the existing evaluation metrics. This paper suggests fine-grained mIoU and mAcc for a more thorough assessment of point cloud segmentation algorithms in order to address these issues. |
Zhuheng Lu; Ting Wu; Yuewei Dai; Weiqing Li; Zhiyong Su; | arxiv-cs.CV | 2024-07-30 |
| 991 | ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding IF:3 Highlight: The recent Segment Anything Model (SAM) reveals the capability to segment objects following prompts, but the manual annotations for prompts are impractical during the surgery. To address these limitations in operating rooms, we propose an audio-driven surgical instrument segmentation framework, named ASI-Seg, to accurately segment the required surgical instruments by parsing the audio commands of surgeons. |
ZHEN CHEN et. al. | arxiv-cs.CV | 2024-07-28 |
| 992 | SMPISD-MTPNet: Scene Semantic Prior-Assisted Infrared Ship Detection Using Multi-Task Perception Networks Highlight: For the training process, we introduce the Soft Fine-tuning training strategy to suppress the distortion caused by data augmentation. |
CHEN HU et. al. | arxiv-cs.CV | 2024-07-25 |
| 993 | Deformable Convolution Based Road Scene Semantic Segmentation of Fisheye Images in Autonomous Driving Highlight: This study investigates the effectiveness of modern Deformable Convolutional Neural Networks (DCNNs) for semantic segmentation tasks, particularly in autonomous driving scenarios with fisheye images. |
ANAM MANZOOR et. al. | arxiv-cs.CV | 2024-07-23 |
| 994 | Disentangling Spatio-temporal Knowledge for Weakly Supervised Object Detection and Segmentation in Surgical Video Highlight: This paper introduces Video Spatio-Temporal Disentanglement Networks (VDST-Net), a framework to disentangle spatiotemporal information using semi-decoupled knowledge distillation to predict high-quality class activation maps (CAMs). |
Guiqiu Liao; Matjaz Jogan; Sai Koushik; Eric Eaton; Daniel A. Hashimoto; | arxiv-cs.CV | 2024-07-22 |
| 995 | Augmented Efficiency: Reducing Memory Footprint and Accelerating Inference for 3D Semantic Segmentation Through Hybrid Vision Highlight: This paper introduces a novel approach to 3D semantic segmentation, distinguished by incorporating a hybrid blend of 2D and 3D computer vision techniques, enabling a streamlined, efficient process. |
Aditya Krishnan; Jayneel Vora; Prasant Mohapatra; | arxiv-cs.CV | 2024-07-22 |
| 996 | GaussianBeV: 3D Gaussian Representation Meets Perception Models for BeV Segmentation IF:3 Highlight: In this paper, we propose GaussianBeV, a novel method for transforming image features to BeV by finely representing the scene using a set of 3D gaussians located and oriented in 3D space. |
Florian Chabot; Nicolas Granger; Guillaume Lapouge; | arxiv-cs.CV | 2024-07-19 |
| 997 | Panoptic Segmentation of Mammograms with Text-To-Image Diffusion Model Highlight: We aim to harness their capabilities for breast lesion segmentation in a panoptic setting, which encompasses both semantic and instance-level predictions. |
Kun Zhao; Jakub Prokop; Javier Montalt Tordera; Sadegh Mohammadi; | arxiv-cs.CV | 2024-07-19 |
| 998 | ViLLa: Video Reasoning Segmentation with Large Language Model IF:3 Highlight: To bridge the gap between image and video, in this work, we propose a new video segmentation task – video reasoning segmentation. |
RONGKUN ZHENG et. al. | arxiv-cs.CV | 2024-07-18 |
| 999 | OE-BevSeg: An Object Informed and Environment Aware Multimodal Framework for Bird’s-Eye-View Vehicle Semantic Segmentation Abstract: Bird’s-eye-view (BEV) semantic segmentation is becoming crucial in autonomous driving systems. It realizes ego-vehicle surrounding environment perception by projecting 2D … |
JIAN SUN et. al. | IEEE Transactions on Intelligent Transportation Systems | 2024-07-18 |
| 1000 | MeshSegmenter: Zero-Shot Mesh Semantic Segmentation Via Texture Synthesis IF:3 Highlight: We present MeshSegmenter, a simple yet effective framework designed for zero-shot 3D semantic segmentation. |
ZIMING ZHONG et. al. | arxiv-cs.CV | 2024-07-18 |