Paper Digest: ICDE 2024 Papers & Highlights
Interested users can choose to read all ICDE-2024 papers in our digest console, which supports more features.
To search for papers presented at ICDE-2024 on a specific topic, please use the search by venue (ICDE-2024) service. To summarize the latest research published at ICDE-2024 on a specific topic, you can use the review by venue (ICDE-2024) service. To synthesize the findings from ICDE 2024 into comprehensive reports, try ICDE-2024 Research. If you are interested in browsing papers by author, we have a comprehensive list of all ICDE-2024 authors & their papers.
This curated list is created by the Paper Digest Team. Paper Digest is an AI-powered research platform that delivers personalized and comprehensive updates on the latest research in your field. It also helps you read articles, write articles, get answers, conduct literature reviews and generate research reports.
TABLE 1: Paper Digest: ICDE 2024 Papers & Highlights
| # | Paper | Author(s) |
|---|---|---|
| 1 | DeepMapping: Learned Data Mapping for Lossless Compression and Efficient Lookup. Highlight: In this work, we argue and show that a novel DeepMapping abstraction, which relies on the impressive memorization capabilities of deep neural networks, can provide better storage cost, better latency, and better run-time memory footprint, all at the same time. | Lixi Zhou; K. Selçuk Candan; Jia Zou |
| 2 | PURPLE: Making A Large Language Model A Better SQL Writer. Highlight: In this paper, we propose PURPLE (Pre-trained models Utilized to Retrieve Prompts for Logical Enhancement), which improves accuracy by retrieving demonstrations containing the requisite logical operator composition for the NL2SQL task at hand, thereby guiding LLMs to produce better SQL translation. | Tonghui Ren; Yuankai Fan; Zhenying He; Ren Huang; Jiaqi Dai; Can Huang; Yinan Jing; Kai Zhang; Yifan Yang; X. Sean Wang |
| 3 | LeaderKV: Improving Read Performance of KV Stores Via Learned Index and Decoupled KV Table. Highlight: This paper presents LeaderKV, a read-optimized LSM-tree-based KV store. | Yi Wang; Jianan Yuan; Shangyu Wu; Huan Liu; Jiaxian Chen; Chenlin Ma; Jianbin Qin |
| 4 | TRAP: Tailored Robustness Assessment for Index Advisors Via Adversarial Perturbation. Highlight: This paper addresses the challenges of assessing the index advisor’s robustness from the following aspects. | Wei Zhou; Chen Lin; Xuanhe Zhou; Guoliang Li; Tianqing Wang |
| 5 | Duet: Efficient and Scalable Hybrid Neural Relation Understanding. Highlight: Although both data-driven and hybrid methods are proposed to avoid this problem, most of them suffer from high training and estimation costs, limited scalability, instability, and long-tail distribution problems on high-dimensional tables, which seriously affects the practical application of learned cardinality estimators. In this paper, we prove that most of these problems are directly caused by the widely used progressive sampling. | Kaixin Zhang; Hongzhi Wang; Yabin Lu; Ziqi Li; Chang Shu; Yu Yan; Donghua Yang |
| 6 | Knowledge Graph Enhanced Multimodal Transformer for Image-Text Retrieval. Highlight: In this paper, instead of directly fusing two cross-modal heterogeneous spaces, we propose a multimodal knowledge enhanced multimodal transformer network framework to combine coarse-grained and fine-grained representation learning into a unified framework, capturing alignment information between targets, constructing a global semantic graph, and ultimately aligning multimodal representations in the semantic space. | Juncheng Zheng; Meiyu Liang; Yang Yu; Yawen Li; Zhe Xue |
| 7 | Functionality-Aware Database Tuning Via Multi-Task Learning. Highlight: However, if a functionality is not running in the tuning phase, its knobs irrelevant to performance changes can also be tuned by existing tools and potential risks would be introduced. To resolve this problem, we design a database knob tuning framework to support functionality-aware knob tuning. | Zhongwei Yue; Shujian Peng; Peng Cai; Xuan Zhou; Huiqi Hu; Rong Zhang; Quanqing Xu; Chuanhui Yang |
| 8 | Costream: Learned Cost Models for Operator Placement in Edge-Cloud Environments. Highlight: In this work, we present Costream, a novel learned cost model for Distributed Stream Processing Systems that provides accurate predictions of the execution costs of a streaming query in an edge-cloud environment. | Roman Heinrich; Carsten Binnig; Harald Kornmayer; Manisha Luthra |
| 9 | SiloFuse: Cross-silo Synthetic Data Generation with Latent Tabular Diffusion Models. Highlight: We introduce SiloFuse, a novel generative framework for high-quality synthesis from cross-silo tabular data. | Aditya Shankar; Hans Brouwer; Rihan Hai; Lydia Chen |
| 10 | Explainable Database Management System Configuration Tuning Through Counterfactuals. Highlight: Moreover, the lack of explainability in machine learning poses a significant obstacle in quantifying the impact of individual knobs on the database performance. Affected by the above factors, this paper proposes CFTune, a method that can accurately evaluate the performance of a DBMS under each configuration using experience and achieve automatic DBMS configuration tuning with counterfactual techniques. | Xinyue Shao; Hongzhi Wang; Xiao Zhu; Tianyu Mu; Yan Zhang |
| 11 | In Situ Neural Relational Schema Matcher. Highlight: In this paper, we propose ISResMat, a framework specifically designed to match the schemas of relational tables by fine-tuning a pre-trained language model. | Xingyu Du; Gongsheng Yuan; Sai Wu; Gang Chen; Peng Lu |
| 12 | Towards Exploratory Query Optimization for Template-Based SQL Workloads. Highlight: In this paper, we propose a novel QEP optimization approach for template-based SQL that aims to minimize the execution cost of an entire workload instead of individual queries. | Jieming Feng; Zhanhuai Li; Qun Chen |
| 13 | SPES: Towards Optimizing Performance-Resource Trade-Off for Serverless Functions. Highlight: Our insight is that the common architecture of serverless systems prompts the concentration of certain invocation patterns, leading to predictable invocation behaviors. | Cheryl Lee; Zhouruixin Zhu; Tianyi Yang; Yintong Huo; Yuxin Su; Pinjia He; Michael R. Lyu |
| 14 | KGLiDS: A Platform for Semantic Abstraction, Linking, and Automation of Data Science. Highlight: Hence, this paper presents a scalable platform, KGLiDS, that employs machine learning and knowledge graph technologies to abstract and capture the semantics of data science artifacts and their connections. | Mossad Helali; Niki Monjazeb; Shubham Vashisth; Philippe Carrier; Ahmed Helal; Antonio Cavalcante; Khaled Ammar; Katja Hose; Essam Mansour |
| 15 | Efficiently Estimating Mutual Information Between Attributes Across Tables. Highlight: We introduce a new sketching method that enables efficient evaluation of relationship discovery queries by estimating MI without materializing the joins and returning a smaller set of tables that are more likely to be relevant. | Aécio Santos; Flip Korn; Juliana Freire |
| 16 | Effective Entry-Wise Flow for Molecule Generation. Highlight: In this paper, we delve into the intricate entry-wise modules in vanilla flows, introducing an effective variation of flow-based models. | Qifan Zhang; Junjie Yao; Yuquan Yang; Yizhou Shi; Wei Gao; Xiaoling Wang |
| 17 | HYPPO: Using Equivalences to Optimize Pipelines in Exploratory Machine Learning. Highlight: We present HYPPO, a novel system to optimize pipelines encountered in exploratory machine learning. | Antonios Kontaxakis; Dimitris Sacharidis; Alkis Simitsis; Alberto Abelló; Sergi Nadal |
| 18 | HITSnDIFFs: From Truth Discovery to Ability Discovery By Recovering Matrices with The Consecutive Ones Property. Highlight: We analyze a general problem in a crowd-sourced setting where one user asks a question (also called item) and other users return answers (also called labels) for this question. | Zixuan Chen; Subhodeep Mitra; R. Ravi; Wolfgang Gatterbauer |
| 19 | Cross-Domain-Aware Worker Selection with Training for Crowdsourced Annotation. Highlight: In this paper, we consider both factors in designing an allocation scheme named cross-domain-aware worker selection with training approach. | Yushi Sun; Jiachuan Wang; Peng Cheng; Libin Zheng; Lei Chen; Jian Yin |
| 20 | Graph Contrastive Learning for Truth Inference. Highlight: Existing approaches rely heavily on hand-engineered assumptions or ground truth data, limiting their applicability. To address this, we propose GOVERN, a graph contrastive learning framework for truth inference without such external supervision. | Hao Liu; Jiacheng Liu; Feilong Tang; Peng Li; Long Chen; Jiadi Yu; Yanmin Zhu; Min Gao; Yanqin Yang; Xiaofeng Hou |
| 21 | Task Recommendation in Spatial Crowdsourcing: A Trade-Off Between Diversity and Coverage. Highlight: Moreover, achieving the highest preference-based utility of workers in most of the existing task recommendation studies is inferior to the benefits of the SC platform and the satisfaction of workers in a long range, due to the lower task coverage rate and the poor diversity in a worker’s recommended list. To address these problems, we propose a Diversity-Coverage Balanced Task Recommendation (DCBTaskRec) framework. | Liwei Deng; Yan Zhao; Yue Cui; Yuyang Xia; Jin Chen; Kai Zheng |
| 22 | MACRO: Incentivizing Multi-Leader Game-Based Pareto-Efficient Crowdsourcing for Video Analytics. Highlight: For maximum profits, platforms carefully choose the workers and determine the video analytics configurations to ensure accuracy; meanwhile, workers possess the flexibility to tailor the configurations for their individual gains, which makes it hard for platforms to optimize their profits considering the platform-worker conflicts. In this paper, we design an incentive mechanism for Multi-leader game-based video Analytics upon CROwdsourcing, named MACRO, to overcome the above situation. | Yu Chen; Sheng Zhang; Ziying Zhou; Xiaokun Wang; Yu Liang; Ning Chen; Yuting Yan; Mingjun Xiao; Jie Wu; Zhuzhong Qian; Harry Xu |
| 23 | Cooperative Global Path Planning for Multiple Platforms. Highlight: Fortunately, with the rise of data sharing and cross-platform cooperation, the data silos between different platforms are gradually being broken. Based on this, we propose the Cooperative Global Path Planning (CGPP) framework to overcome the above shortcoming. | Xiaoxi Cui; Yurong Cheng; Siyi Zhang; Ye Yuan; Guoren Wang |
| 24 | Cross Online Assignment of Hybrid Task in Spatial Crowdsourcing. Highlight: To solve HyTAO effectively, we utilize a cross-platform cooperation model to tackle the challenge of non-uniform distribution. | Zhao Liu; Guoqing Xiao; Xu Zhou; Yunchuan Qin; Yunjun Gao; Kenli Li |
| 25 | RA3: A Human-in-the-loop Framework for Interpreting and Improving Image Captioning with Relation-Aware Attribution Analysis. Highlight: In this paper, we present RA3 (Relation-Aware Attribution Analysis), a human-in-the-loop framework for improving the interpretability and further boosting the performance of the image captioning model. | Lei Chai; Lu Qi; Hailong Sun; Jingzheng Li |
| 26 | Efficient Example-Guided Interactive Graph Search. Highlight: On the theory side, we propose the Target-Sensitive IGS (TS-IGS) algorithm that achieves a query cost complexity of $O(\log n\cdot\log\frac{L}{\log n}+d\cdot\log_{d}n)$, where $L$ is the length of the path from the root of $H$ to the target concept. | Zhuowei Zhao; Junhao Gan; Jianzhong Qi; Zhifeng Bao |
| 27 | Wait to Be Faster: A Smart Pooling Framework for Dynamic Ridesharing. Highlight: In this paper, we address an NP-hard ridesharing problem, called Minimal Extra Time RideSharing (METRS), which balances waiting time and group quality (i.e., detour time) to improve riders’ satisfaction. | Xiaoyao Zhong; Jiabao Jin; Peng Cheng; Wangze Ni; Libin Zheng; Lei Chen; Xuemin Lin |
| 28 | Adaptive Recursive Query Optimization. Highlight: In this paper, we introduce Adaptive Metaprogramming, an innovative technique that shifts recursive query optimization and code generation from compile-time to runtime using principled metaprogramming, enabling dynamic optimization and re-optimization before and after query execution has begun. | Anna Herlihy; Guillaume Martres; Anastasia Ailamaki; Martin Odersky |
| 29 | Ontology-Mediated Query Answering Using Graph Patterns with Conditions. Highlight: This paper proposes an extension of graph patterns, referred to as ontological graph patterns (OGPs), to accelerate ontology-mediated query answering. | Ping Lu; Ting Deng; Haoyuan Zhang; Yufeng Jin; Feiyi Liu; Tiancheng Mao; Lexiao Liu |
| 30 | An Efficient Algorithm for Continuous Complex Event Matching Using Bit-Parallelism. Highlight: To support nonconsecutive event matching, it has to maintain a large number of partial matches and skip irrelevant events, which results in a huge overhead. To avoid this problem, we employ the bit parallelism technique to match complex events continuously in this paper. | Tao Qiu; Shenwang Jiang; Xiaochun Yang; Bin Wang; Chuanyu Zong; Rui Zhu |
| 31 | Personalized PageRanks Over Dynamic Graphs – The Case for Optimizing Quality of Service. Highlight: We study the problem of Quality-of-Service (QoS)-Aware Personalized PageRank (PPR) computation. | Zulun Zhu; Siqiang Luo; Wenqing Lin; Sibo Wang; Dingheng Mo; Chunbo Li |
| 32 | PyTond: Efficient Python Data Science on The Shoulders of Databases. Highlight: In this paper, we present PyTond, an efficient approach to push the processing of data science workloads down into the database engines that are already known for their big data handling capabilities. | Hesam Shahrokhi; Amirali Kaboli; Mahdi Ghorbani; Amir Shaikhha |
| 33 | Efficient Fault Tolerance for Pipelined Query Engines Via Write-ahead Lineage. Highlight: In this work, we present write-ahead lineage, a novel fault recovery technique that combines Spark’s lineage-based replay and write-ahead logging. | Ziheng Wang; Alex Aiken |
| 34 | Independent Range Sampling on Interval Data. Highlight: We therefore address the problem of independent range sampling on interval data, which outputs $s$ random samples that overlap a given query interval and are independent of the samples of all previous queries. To efficiently solve this problem theoretically and practically, we propose a variant of an interval tree, namely the augmented interval tree (or AIT), and we show that there exists an exact algorithm that needs $O(n\log n)$ space and $O(\log^{2}n+s)$ time, where $n$ is the dataset size. | Daichi Amagata |
| 35 | Incremental Fusion: Unifying Compiled and Vectorized Query Execution. Highlight: We present Incremental Fusion, a novel execution paradigm for modern, high-performance query engines. | Benjamin Wagner; André Kohn; Peter Boncz; Viktor Leis |
| 36 | The Indistinguishability Query. Highlight: We propose the indistinguishability query for identifying all of a user’s near-optimal tuples. | Ashwin Lall |
| 37 | Range Cache: An Efficient Cache Component for Accelerating Range Queries on LSM-Based Key-Value Stores. Highlight: However, LSM-tree suffers from the block-cache invalidation problem caused by periodical compaction operations, which lowers the efficiency of the block cache and leads to poor read performance, especially for range queries. To address this problem, we propose a novel cache component named Range Cache to accelerate range queries on LSM-based key-value stores. | Xiaoliang Wang; Peiquan Jin; Yongping Luo; Zhaole Chu |
| 38 | Optimizing Context-Enhanced Relational Joins. Highlight: On the other hand, representation learning models can map context-rich data into embeddings, enabling machine-automated context processing but requiring imperative data transformation integration with the analytical query. We present a context-enhanced relational join operator to bridge this dichotomy and introduce an embedding operator composable with relational operators. | Viktor Sanca; Manos Chatzakis; Anastasia Ailamaki |
| 39 | IndeXY: A Framework for Constructing Indexes Larger Than Memory. Highlight: In this paper, we proposed a memory-disk-spanning index design, named IndeXY, to effectively address the challenges. | Chen Zhong; Qingqing Zhou; Yuxing Chen; Xingsheng Zhao; Kuang He; Anqun Pan; Song Jiang |
| 40 | Are ID Embeddings Necessary? Whitening Pre-trained Text Embeddings for Effective Sequential Recommendation. Highlight: We also demonstrate that this anisotropic nature hinders recommendation models from effectively differentiating between item representations and leads to degenerated performance. To address this issue, we propose to employ a pre-processing step known as whitening transformation, which transforms the anisotropic text feature distribution into an isotropic Gaussian distribution. | Lingzi Zhang; Xin Zhou; Zhiwei Zeng; Zhiqi Shen |
| 41 | Structure- and Logic-Aware Heterogeneous Graph Learning for Recommendation. Highlight: In this paper, we propose a novel structure- and logic-aware heterogeneous graph learning framework for recommender systems (SLHRec). | Anchen Li; Bo Yang; Huan Huo; Farookh Khadeer Hussain; Guandong Xu |
| 42 | Graph Augmentation for Recommendation. Highlight: Secondly, many existing GCL approaches rely on graph neural network (GNN) architectures, which can suffer from over-smoothing problems due to non-adaptive message passing. To address these challenges, we propose a principled framework called GraphAug. | Qianru Zhang; Lianghao Xia; Xuheng Cai; Siu-Ming Yiu; Chao Huang; Christian S. Jensen |
| 43 | From Chaos to Clarity: Time Series Anomaly Detection in Astronomical Observations. Highlight: To overcome the challenges, we propose AERO, a novel two-stage framework tailored for unsupervised anomaly detection in astronomical observations. | Xinli Hao; Yile Chen; Chen Yang; Zhihui Du; Chaohong Ma; Chao Wu; Xiaofeng Meng |
| 44 | Enhancing Topic Interpretability for Neural Topic Modeling Through Topic-Wise Contrastive Learning. Highlight: Overemphasizing likelihood maximization without incorporating topic regularization can lead to an overly expansive latent space for topic modeling. In this paper, we present an innovative approach to NTMs that addresses this misalignment by introducing contrastive learning measures to assess topic interpretability. | Xin Gao; Yang Lin; Ruiqing Li; Yasha Wang; Xu Chu; Xinyu Ma; Hailong Yu |
| 45 | Improve ROI with Causal Learning and Conformal Prediction. Highlight: This paper presents a robust Direct ROI Prediction (rDRP) method, designed to address challenges in real-world deployment of neural network-based uplift models, particularly under conditions of covariate shift and insufficient training data. | Meng Ai; Zhuo Chen; Jibin Wang; Jing Shang; Tao Tao; Zhen Li |
| 46 | Hide Your Model: A Parameter Transmission-free Federated Recommender System. Highlight: With the model size increasing, the communication burden will be the bottleneck for such traditional FedRecs. Given the above limitations, this paper introduces a novel parameter transmission-free federated recommendation framework that balances the protection between users’ data privacy and platforms’ model privacy, namely PTF-FedRec. | Wei Yuan; Chaoqun Yang; Liang Qu; Quoc Viet Hung Nguyen; Jianxin Li; Hongzhi Yin |
| 47 | TimeDRL: Disentangled Representation Learning for Multivariate Time-Series. Highlight: Recent studies in self-supervised learning have shown their potential in learning rich representations without relying on labels, yet they fall short in learning disentangled embeddings and addressing issues of inductive bias (e.g., transformation-invariance). To tackle these challenges, we propose TimeDRL, a generic multivariate time-series representation learning framework with disentangled dual-level embeddings. | Ching Chang; Chiao-Tung Chan; Wei-Yao Wang; Wen-Chih Peng; Tien-Fu Chen |
| 48 | Boosting Meaningful Dependency Mining with Clustering and Covariance Analysis. Highlight: However, existing methods can hardly discover practical and interpretable FDs, especially in large noisy real-life datasets. This paper studies the problem of discovering meaningful functional dependencies (FDms) that utilize support and error parameters to capture interesting dependencies in such datasets and proposes an efficient discovery algorithm called FDMε. | Xi Wang; Ruochun Jin; Wanrong Huang; Yuhua Tang |
| 49 | Uncovering The Propensity Identification Problem in Debiased Recommendations. Highlight: To tackle this research gap, we propose to disentangle user and item embeddings into the primary latent vector for rating prediction and the auxiliary latent vector for missing mechanism modeling. | Honglei Zhang; Shuyi Wang; Haoxuan Li; Chunyuan Zheng; Xu Chen; Li Liu; Shanshan Luo; Peng Wu |
| 50 | Scaling Up Multivariate Time Series Pre-Training with Decoupled Spatial-Temporal Representations. Highlight: In this paper, we present a novel Decoupled Spatial-Temporal Representation Learning (DeSTR) framework to serve as the backbone network for investigating the data scaling capability of multivariate time series pre-training architectures. | Rui Zha; Le Zhang; Shuangli Li; Jingbo Zhou; Tong Xu; Hui Xiong; Enhong Chen |
| 51 | Meta-optimized Structural and Semantic Contrastive Learning for Graph Collaborative Filtering. Highlight: Existing data augmentation or noise perturbation may destroy the structural and semantic features of the original data and node attribute information is not considered. To tackle the above limitations, we propose a Meta-optimized Structure and Semantic Contrastive Learning for Graph Collaborative Filtering, named Meta-SSCL, which utilizes graph structure information and semantic information contrastive learning for recommendation. | Yongjing Hao; Pengpeng Zhao; Jianfeng Qu; Lei Zhao; Guanfeng Liu; Fuzhen Zhuang; Victor S. Sheng; Xiaofang Zhou |
| 52 | Efficient Set-Based Order Dependency Discovery with A Level-Wise Hybrid Strategy. Highlight: In this paper we investigate the problem of set-based OD discovery, for automatically finding hidden ODs from data. | Yihan Li; Ruifeng Li; Zijing Tan; Weidong Yang; Shuai Ma |
| 53 | Meta-Optimized Joint Generative and Contrastive Learning for Sequential Recommendation. Highlight: In this paper, we propose a Meta-optimized Seq2Seq Generator and Contrastive Learning (Meta-SGCL) for sequential recommendation, which applies the meta-optimized two-step training strategy to adaptively generate contrastive views. | Yongjing Hao; Pengpeng Zhao; Junhua Fang; Jianfeng Qu; Guanfeng Liu; Fuzhen Zhuang; Victor S. Sheng; Xiaofang Zhou |
| 54 | Multi-Modal Siamese Network for Few-Shot Knowledge Graph Completion. Highlight: The most relevant FKGC study simply concatenates various modal features, but the performance is still limited due to the following problems: (1) lack of exploiting significant multi-modal features in neighborhoods, and (2) ineffectively modeling inter-modal interactions in a few-shot setting. To tackle these problems, we propose a novel relational learning model entitled MMSN (Multi-Modal Siamese Network) for few-shot knowledge graph completion, which is composed of the following two primary modules: the Siamese multi-modal neighbor encoder (SMNE) and the meta-learning multi-modal knowledge representation decoder (MKRD). | Yuyang Wei; Wei Chen; Xiaofang Zhang; Pengpeng Zhao; Jianfeng Qu; Lei Zhao |
| 55 | Local-Global History-Aware Contrastive Learning for Temporal Knowledge Graph Reasoning. Highlight: To this end, we propose a novel Local-global history-aware Contrastive Learning model (LogCL) for TKG reasoning, which adopts contrastive learning to better guide the fusion of local and global historical information and enhance the ability to resist interference. | Wei Chen; Huaiyu Wan; Yuting Wu; Shuyuan Zhao; Jiayaqi Cheng; Yuxin Li; Youfang Lin |
| 56 | Learning Multi-Pattern Normalities in The Frequency Domain for Efficient Time Series Anomaly Detection. Highlight: Thus, we propose MACE, a multi-normal-pattern accommodated and efficient anomaly detection method in the frequency domain for time series anomaly detection. | Feiyi Chen; Yingying Zhang; Zhen Qin; Lunting Fan; Renhe Jiang; Yuxuan Liang; Qingsong Wen; Shuiguang Deng |
| 57 | Modeling User Attention in Music Recommendation. Highlight: In this paper, we naturally propose modeling user attention prediction as a positive-unlabeled (PU) learning problem, where active feedback is treated as positive samples and passive feedback is treated as unlabeled samples, as we can only ensure that the user’s attention is focused when she provides active feedback. | Sunhao Dai; Ninglu Shao; Jieming Zhu; Xiao Zhang; Zhenhua Dong; Jun Xu; Quanyu Dai; Ji-Rong Wen |
| 58 | A Robust Prioritized Anomaly Detection When Not All Anomalies Are of Primary Interest. Highlight: Thus, we introduce a novel semi-supervised model, called TargAD, which leverages a few labeled target anomalies, along with potential non-target anomaly candidates and normal candidates selected from unlabeled data. | Guanyu Lu; Fang Zhou; Martin Pavlovski; Chenyi Zhou; Cheqing Jin |
| 59 | Enhancing Quantitative Reasoning Skills of Large Language Models Through Dimension Perception. Highlight: Hence, we present a framework to enhance the quantitative reasoning ability of language models based on dimension perception. | Yuncheng Huang; Qianyu He; Jiaqing Liang; Sihang Jiang; Yanghua Xiao; Yunwen Chen |
| 60 | SSDRec: Self-Augmented Sequence Denoising for Sequential Recommendation. Highlight: To improve reliability, we propose to augment sequences by inserting items before denoising. | Chi Zhang; Qilong Han; Rui Chen; Xiangyu Zhao; Peng Tang; Hongtao Song |
| 61 | BSL: Understanding and Improving Softmax Loss for Recommendation. Highlight: Toward addressing this research gap, we conduct theoretical analyses on SL and uncover three insights: 1) Optimizing SL is equivalent to performing Distributionally Robust Optimization (DRO) on the negative data, thereby learning against perturbations on the negative distribution and yielding robustness to noisy negatives. | Junkang Wu; Jiawei Chen; Jiancan Wu; Wentao Shi; Jizhi Zhang; Xiang Wang |
| 62 | Online Detection of Outstanding Quantiles with QuantileFilter. Highlight: In this paper, we propose QuantileFilter, the first approximate algorithm specifically designed for detecting quantile-outstanding keys. | Yuhan Wu; Aomufei Yuan; Zhouran Shi; Yuanpeng Li; Yikai Zhao; Peiqing Chen; Tong Yang; Bin Cui |
| 63 | When Multi-Behavior Meets Multi-Interest: Multi-Behavior Sequential Recommendation with Multi-Interest Self-Supervised Learning. Highlight: In this paper, we propose a novel approach called Multi-Interest Self-Supervised Learning (MISSL) that precisely unifies multi-behavior and multi-interest modeling to obtain more comprehensive and accurate user profiles. | Binquan Wu; Yu Cheng; Haitao Yuan; Qianli Ma |
| 64 | E2GCL: Efficient and Expressive Contrastive Learning on Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an efficient and expressive contrastive learning framework for GNNs, namely E2GCL. |
Haoyang Li; Shimin Di; Lei Chen; Xiaofang Zhou; |
| 65 | Ambiguous Entity Oriented Targeted Document Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, in this paper, we explore a new task of targeted document detection, which aims to detect those targeted documents (i.e., documents really mentioning the target entity) from the given candidate documents each of which contains an ambiguous name of the target entity. |
Wei Shen; Haixu Wen; |
| 66 | TS3Net: Triple Decomposition with Spectrum Gradient for Long-Term Time Series Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike conventional time series decomposition that decouples a series into the trend and seasonal parts, we proposed a novel triple decomposition method to decouple a long-term series into three components: trend-part, regular-part, and fluctuant-part. |
Xiangkai Ma; Xiaobin Hong; Sanglu Lu; Wenzhong Li; |
| 67 | An Efficient Fuzzy Stream Clustering Method Based on Granular-Ball Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, most data stream clustering algorithms struggle to effectively deal with the problem of cluster boundary overlap caused by the concept drift. To tackle these challenges, we use a granular-ball structure for the coarse-grained representation of data stream. |
Jiang Xie; Minggao Dai; Shuyin Xia; Jinajinz Zhang; Guoyin Wang; Xinbo Gao; |
| 68 | W-GBC: An Adaptive Weighted Clustering Method Based on Granular-Ball Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The fundamental issue lies in the fact that most weighted clustering methods derive feature weights through global iterations. To address this challenge, this paper introduces a novel weighted granular-ball structure, continually optimizing weights during the ball splitting process and restricting the calculation of local data point weights to the corresponding weighted granular-ball. |
Jiang Xie; Chunfeng Hua; Shuyin Xia; Yuxin Cheng; Guoyin Wang; Xinbo Gao; |
| 69 | RobFL: Robust Federated Learning Via Feature Center Separation and Malicious Center Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance the robustness of existing federated learning systems, we propose a novel framework called RobFL. |
Ting Zhou; Ning Liu; Bo Song; Hongtao Lv; Deke Guo; Lei Liu; |
| 70 | Towards Task-Conflicts Momentum-Calibrated Approach for Multi-task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct an in-depth empirical investigation into the potential sources of performance degradation of MTL and find that task gradient conflict is one of the primary reasons for the performance degradation of tasks. |
Heyan Chai; Zeyu Liu; Yongxin Tong; Ziyi Yao; Binxing Fang; Qing Liao; |
| 71 | Hybrid Evaluation for Occlusion-based Explanations on CNN Inference Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they are oblivious that incremental evaluation does not always outperform full evaluation for certain layers. To address this issue, we propose a hybrid evaluation to efficiently interleave full and incremental evaluations during the CNN inference. |
Guangyao Ding; Chen Xu; Weining Qian; |
| 72 | Enhancing The Performance of Bandit-based Hyperparameter Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: When confronted with numerous configurations and high-dimensional large problems, existing bandit-based methods face challenges of high evaluation cost and poor optimization performance. To address these challenges, we introduce an improved bandit-based approach that exhibits enhanced evaluation ability and is suitable for situations with limited resources. |
Yile Chen; Zeyi Wen; Jian Chen; Jin Huang; |
| 73 | Unraveling The ‘Anomaly’ in Time Series Anomaly Detection: A Self-supervised Tri-domain Solution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This problem is exacerbated by an ill-posed evaluation metric, known as point adjustment (PA), which results in inflated model performance. In this context, we propose a novel self-supervised learning based Tri-domain Anomaly Detector (TriAD), which addresses these challenges by modeling features across three aspects – temporal, frequency, and residual domains – without relying on anomaly labels. |
Yuting Sun; Guansong Pang; Guanhua Ye; Tong Chen; Xia Hu; Hongzhi Yin; |
| 74 | A Robust Low-Rank Tensor Decomposition and Quantization Based Compression Method Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the compression ratio provided by Tucker decomposition is often insufficient, particularly for large-scale tensors. To tackle this issue, we propose a robust low-rank tensor compression method that leverages Tucker decomposition with quantization and coding. |
Yudian Ouyang; Kun Xie; Jigang Wen; Gaogang Xie; Kenli Li; |
| 75 | A Coarse-to-Fine Framework for Entity-Relation Joint Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel coarse-to-fine extraction framework, which first extracts high-potential relations as well as entities via knowledge distillation, and then rechecks the predictions via handcrafted natural language inference (NLI) task in a fine-grained manner. |
Mingchen Zhang; Jiaan Wang; Jianfeng Qu; Zhixu Li; An Liu; Lei Zhao; Zhigang Chen; Xiaofang Zhou; |
| 76 | KGLink: A Column Type Annotation Method That Combines Knowledge Graph and Pre-Trained Language Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents KGLink, a method that combines Wiki-Data KG information with a pre-trained deep learning language model for table column annotation, effectively addressing both type granularity and valuable context missing issues. |
Yubo Wang; Hao Xin; Lei Chen; |
| 77 | Learning K-Determinantal Point Processes for Personalized Ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a new optimization criterion LkP based on set probability comparison for personalized ranking that moves beyond traditional ranking-based methods. |
Yuli Liu; Christian Walder; Lexing Xie; |
| 78 | A Unified Replay-Based Continuous Learning Framework for Spatio-Temporal Prediction on Streaming Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enable spatio-temporal prediction on streaming data, we propose a unified replay-based continuous learning framework. |
Hao Miao; Yan Zhao; Chenjuan Guo; Bin Yang; Kai Zheng; Feiteng Huang; Jiandong Xie; Christian S. Jensen; |
| 79 | Representation Learning of Tangled Key-Value Sequence Data for Early Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The goal is to classify each individual key-value sequence sharing a same key both accurately and early. To address this problem, we propose a novel method, i.e., Key-Value sequence Early Co-classification (KVEC), which leverages both inner- and inter-correlations of items in a tangled key-value sequence through key correlation and value correlation to learn a better sequence representation. |
Tao Duan; Junzhou Zhao; Shuo Zhang; Jing Tao; Pinghui Wang; |
| 80 | A Two-Phase Recall-and-Select Framework for Fast Model Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a two-phase (coarse-recall and fine-selection) model selection framework, aiming to enhance the efficiency of selecting a robust model by leveraging the models’ training performances on benchmark datasets. |
Jianwei Cui; Wenhang Shi; Honglin Tao; Wei Lu; Xiaoyong Du; |
| 81 | BTS: Load-Balanced Distributed Union-Find for Finding Connected Components with Balanced Tree Structures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose BTS, a new fast and scalable distributed Union-Find algorithm for finding connected components in large graphs. |
Chaeeun Kim; Changhun Han; Ha-Myung Park; |
| 82 | Interpretable Knowledge Tracing Via Response Influence-based Counterfactual Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we resort to counterfactual reasoning that intervenes in each response to answer what if a student had answered a question incorrectly that he/she actually answered correctly, and vice versa. |
Jiajun Cui; Minghe Yu; Bo Jiang; Aimin Zhou; Jianyong Wang; Wei Zhang; |
| 83 | Stable Heterogeneous Treatment Effect Estimation Across Out-of-Distribution Populations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In real-world applications, where population distributions are subject to continuous changes, there is an urgent need for stable HTE estimation across out-of-distribution (OOD) populations, which, however, remains an open problem. As pioneers in resolving this problem, we propose a novel Stable Balanced Representation Learning with Hierarchical-Attention Paradigm (SBRL-HAP) framework, which consists of 1) Balancing Regularizer for eliminating selection bias, 2) Independence Regularizer for addressing the distribution shift issue, 3) Hierarchical-Attention Paradigm for coordination between balance and independence. |
Yuling Zhang; Anpeng Wu; Kun Kuang; Liang Du; Zixun Sun; Zhi Wang; |
| 84 | Towards Cross-Domain Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a novel approach called Cross-Domain Continual Learning (CDCL) that addresses the limitations of being limited to single supervised domains. |
Marcus De Carvalho; Mahardhika Pratama; Jie Zhang; Chua Haoyan; Edward Yapp; |
| 85 | DROPP: Structure-Aware PCA for Ordered Data: A General Method and Its Applications in Climate Research and Molecular Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce DROPP, which incorporates order into dimensionality reduction by adapting a Gaussian kernel function across the ordered covariances between data points. |
Anna Beer; Olivér Palotás; Andrea Maldonado; Andrew Draganov; Ira Assent; |
| 86 | Scalable Overspeed Item Detection in Streams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We have pinpointed the inefficiency in allocating memory for all users, recognizing that only a small fraction exhibit overspeed behavior at any given time. Addressing this, we employed the sketching technique, a type of approximate algorithm, and designed the first sketch algorithm for finding Overspeed items, named SpeedSketch. |
Yuhan Wu; Hanbo Wu; Chengjun Jia; Bo Peng; Ziyun Zhang; Tong Yang; Peiqing Chen; Kaicheng Yang; Bin Cui; |
| 87 | GradGCL: Gradient Graph Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: From a data engineering view, this assumption fails to deeply mine the graph data and oversimplifies the complexity and heterogeneity of graph data, leading to clustered and redundant representations. To address this issue, we propose GradGCL, a novel method that leverages intrinsic gradient information as an additional input signal to regularize GCL training. |
Ran Li; Shimin Di; Lei Chen; Xiaofang Zhou; |
| 88 | ST-ABC: Spatio-Temporal Attention-Based Convolutional Network for Multi-Scale Lane-Level Traffic Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel lightweight, attention-based, fully convolutional model, named the Spatio-Temporal Attention-Based Convolutional network (ST-ABC), where lane segments are treated as graph nodes and dynamically models the adjacent spatial dependencies using local attention graph convolution. |
Shuhao Li; Yue Cui; Libin Li; Weidong Yang; Fan Zhang; Xiaofang Zhou; |
| 89 | CPDG: A Contrastive Pre-Training Method for Dynamic Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, in this paper, we propose to address them by pre-training and present the Contrastive Pre-Training Method for Dynamic Graph Neural Networks (CPDG). |
Yuanchen Bei; Hao Xu; Sheng Zhou; Huixuan Chi; Haishuai Wang; Mengdi Zhang; Zhao Li; Jiajun Bu; |
| 90 | Graph Anomaly Detection at Group Level: A Topology Pattern Enhanced Unsupervised Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing graph anomaly detection algorithms focus on distinguishing individual entities (nodes or graphs) and overlook the possibility of anomalous groups within the graph. To address this limitation, this paper introduces a novel unsupervised framework for a new task called Group-level Graph Anomaly Detection (Gr-GAD). |
Xing Ai; Jialong Zhou; Yulin Zhu; Gaolei Li; Tomasz P. Michalak; Xiapu Luo; Kai Zhou; |
| 91 | Temporal-Frequency Masked Autoencoders for Time Series Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While existing reconstruction-based methods have demonstrated favorable detection capabilities in the absence of labeled data, they still encounter issues of training bias on abnormal times and distribution shifts within time series. To address these issues, we propose a simple yet effective Temporal-Frequency Masked AutoEncoder (TFMAE) to detect anomalies in time series through a contrastive criterion. |
Yuchen Fang; Jiandong Xie; Yan Zhao; Lu Chen; Yunjun Gao; Kai Zheng; |
| 92 | REGER: Reordering Time Series Data for Regression Encoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to reorder the time series data for better regression encoding. |
Jinzhao Xiao; Wendi He; Shaoxu Song; Xiangdong Huang; Chen Wang; Jianmin Wang; |
| 93 | SAGDFN: A Scalable Adaptive Graph Diffusion Forecasting Network for Multivariate Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present a Scalable Adaptive Graph Diffusion Forecasting Network (SAGDFN) to capture complex spatial-temporal correlation for large-scale multivariate time series and thereby, leading to exceptional performance in multivariate time series forecasting tasks. |
Yue Jiang; Xiucheng Li; Yile Chen; Shuai Liu; Weilong Kong; Antonis F. Lentzakis; Gao Cong; |
| 94 | Knowledge-Enhanced Recommendation with User-Centric Subgraph Network Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, most KG-based methods adopt node embeddings, which do not provide personalized recommendations for different users and cannot generalize well to the new items. To address these limitations, we propose Knowledge-enhanced User-Centric subgraph Network (KUCNet), a subgraph learning approach with graph neural network (GNN) for effective recommendation. |
Guangyi Liu; Quanming Yao; Yongqi Zhang; Lei Chen; |
| 95 | MUSE-Net: Disentangling Multi-Periodicity for Traffic Flow Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel disentanglement learning network, called MUSE-Net, to tackle the limitations of entanglement learning by simultaneously factorizing the exclusiveness and interaction of multi-periodic patterns in traffic flow. |
Jianyang Qin; Yan Jia; Yongxin Tong; Heyan Chai; Ye Ding; Xuan Wang; Binxing Fang; Qing Liao; |
| 96 | Model Selection with Model Zoo Via Graph Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce TransferGraph, a novel framework that reformulates model selection as a graph learning problem. |
Ziyu Li; Hilco Van Der Wilk; Danning Zhan; Megha Khosla; Alessandro Bozzon; Rihan Hai; |
| 97 | Logical Relation Modeling and Mining in Hyperbolic Space for Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose to extract logical relations among item tags from existing tag taxonomies and exploit the individual strengths of the Poincaré and the Lorentz models in hyperbolic space for logical relation modeling towards enhanced recommendations. |
Yanchao Tan; Hang Lv; Zihao Zhou; Wenzhong Guo; Bo Xiong; Weiming Liu; Chaochao Chen; Shiping Wang; Carl Yang; |
| 98 | HeteFedRec: Federated Recommender Systems with Model Heterogeneity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For example, clients with limited training data may prefer to train a smaller recommendation model to avoid excessive data consumption, while clients with sufficient data would benefit from a larger model to achieve higher recommendation accuracy. To address the above challenge, this paper introduces HeteFedRec, a novel FedRec framework that enables the assignment of personalized model sizes to participants. |
Wei Yuan; Liang Qu; Lizhen Cui; Yongxin Tong; Xiaofang Zhou; Hongzhi Yin; |
| 99 | A Compact and Accurate Sketch for Estimating A Large Range of Set Difference Cardinalities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we design a novel data structure of bit array GXBits to fast and accurately estimate set difference cardinalities in a large range. |
Peng Jia; Pinghui Wang; Rundong Li; Junzhou Zhao; Junlan Feng; Xidian Wang; Xiaohong Guan; |
| 100 | A Unified Model for Spatio-Temporal Prediction Queries with Arbitrary Modifiable Areal Units Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose One4All-ST, a framework that can conduct ST prediction for arbitrary modifiable areal units using only one model. |
Liyue Chen; Jiangyi Fang; Tengfei Liu; Shaosheng Cao; Leye Wang; |
| 101 | Across Images and Graphs for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SVQA that semantically combines the knowledge from available images and graphs to answer the complex question. |
Zhenyu Wen; Jiaxu Qian; Bin Qian; Qin Yuan; Jianbin Qin; Qi Xuan; Ye Yuan; |
| 102 | LightLT: A Lightweight Representation Quantization Framework for Long-Tail Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite their advantages, these methods often encounter difficulties in handling long-tail datasets due to imbalanced class distributions. To address this, we propose LightLT, a lightweight representation quantization framework tailored for long-tail datasets. |
Haoyu Wang; Ruirui Li; Zhengyang Wang; Xianfeng Tang; Danqing Zhang; Monica Cheng; Bing Yin; Jasha Droppo; Suhang Wang; Jing Gao; |
| 103 | TSec: An Efficient and Effective Framework for Time Series Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce TSec, an innovative time series classification framework that exhibits high training efficiency and classification accuracy for both univariate time series and multivariate time series. |
Yuanyuan Yao; Hailiang Jie; Lu Chen; Tianyi Li; Yunjun Gao; Shiting Wen; |
| 104 | Mccatch: Scalable Microcluster Detection in Dimensional and Nondimensional Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Mccatch: a new algorithm that detects microclusters by leveraging our proposed ‘Oracle’ plot (1NN Distance versus Group 1NN Distance). |
Braulio V. Sánchez Vinces; Robson L. F. Cordeiro; Christos Faloutsos; |
| 105 | Contrastive Learning for Fraud Detection from Noisy Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is well known that the performance of deep learning models can easily degrade because of noisy or inaccurate labels. To tackle this challenge, we propose a supervised Contrastive Learning based Fraud Detection (CLFD) framework, which is designed to operate in the noisy label setting. |
Vinay M. S.; Shuhan Yuan; Xintao Wu; |
| 106 | Adapting Large Language Models By Integrating Collaborative Semantics for Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In essence, LLMs capture language semantics while recommender systems imply collaborative semantics, making it difficult to sufficiently leverage the model capacity of LLMs for recommendation. To address this challenge, in this paper, we propose a new LLM-based recommendation model called LC-Rec, which can better integrate language and collaborative semantics for recommender systems. |
Bowen Zheng; Yupeng Hou; Hongyu Lu; Yu Chen; Wayne Xin Zhao; Ming Chen; Ji-Rong Wen; |
| 107 | Effective Data Selection and Replay for Unsupervised Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To further improve the UCL performance, we present a new method in this paper, named Effective Data Selection and Replay (EDSR) for UCL. |
Hanmo Liu; Shimin Di; Haoyang Li; Shuangyin Li; Lei Chen; Xiaofang Zhou; |
| 108 | Target-agnostic Source-free Domain Adaptation for Regression Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To overcome it, we propose TASFAR, a novel target-agnostic source-free domain adaptation approach for regression tasks. |
Tianlang He; Zhiqiu Xia; Jierun Chen; Haoliang Li; S.-H. Gary Chan; |
| 109 | Fast Parallel Recovery for Transactional Stream Processing on Multicores Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel TSPE called MorphStreamR to achieve fast failure recovery while guaranteeing low performance overhead at runtime. |
Jianjun Zhao; Haikun Liu; Shuhao Zhang; Zhuohui Duan; Xiaofei Liao; Hai Jin; Yu Zhang; |
| 110 | PP-Stream: Toward High-Performance Privacy-Preserving Neural Network Inference Via Distributed Stream Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the expensive cryptographic operations of privacy preservation also pose performance challenges to neural network inference. We address this performance-security tension by designing PP-Stream, a distributed stream processing system for high-performance privacy-preserving neural network inference. |
Qingxiu Liu; Qun Huang; Xiang Chen; Sa Wang; Wenhao Wang; Shujie Han; Patrick P. C. Lee; |
| 111 | AdaEdge: A Dynamic Compression Selection Framework for Resource Constrained Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces AdaEdge, a dynamic, hardware-conscious compression selection framework tailored for resource-constrained devices. |
Chunwei Liu; John Paparrizos; Aaron J. Elmore; |
| 112 | A Predictive Profiling and Performance Modeling Approach for Distributed Stream Processing in Edge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a predictive profiling model to enable measuring the performance of a system by predicting the operators’ processing time on heterogeneous devices without having to carry out the testing on individual devices. |
Hasan Geren; Nasrin Sohrabi; Zahir Tari; Nour Moustafa; |
| 113 | Joint Mobile Edge Caching and Pricing: A Mean-Field Game Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the competitive content placement problem in Mobile Edge Caching (MEC) systems, where Edge Data Providers (EDPs) cache appropriate contents and trade them with requesters at a suitable price. |
Yin Xu; Xichong Zhang; Mingjun Xiao; Jie Wu; An Liu; Sheng Zhang; |
| 114 | Online Container Caching with Late-Warm for IoT Data Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study online container caching in serverless edge computing to minimize the total latency with Late-Warm and other practical issues considered. |
Guopeng Li; Haisheng Tan; Xuan Zhang; Chi Zhang; Ruiting Zhou; Zhenhua Han; Guoliang Chen; |
| 115 | COUPLE: Orchestrating Video Analytics on Heterogeneous Mobile Processors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce COUPLE, an orchestration framework for video analytics on heterogeneous mobile processors, with the goal of optimizing real-time video analysis through the collaboration of CPU, GPU and DSP. |
Hao Bao; Zhi Zhou; Fei Xu; Xu Chen; |
| 116 | Multiple Continuous Top-K Queries Over Data Stream Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel index PH-Tree (Partition and Heap-based Binary Tree), designed to facilitate multiple continuous top-k queries. |
Rui Zhu; Yujin Jia; Xiaochun Yang; Baihua Zheng; Bin Wang; Chuanyu Zong; |
| 117 | CodingSketch: A Hierarchical Sketch with Efficient Encoding and Recursive Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fill the gap, we propose a new sketch called Coding Sketch. |
Qizhi Chen; Yisen Hong; Yuhan Wu; Tong Yang; Bin Cui; |
| 118 | Everything Everyway All at Once – Time Traveling Debugging for Stream Processing Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Debugging, evaluating, and optimizing stream processing applications is challenging due to continuous streams of input data and typically distributed and parallel execution environments. To address these issues, we present an approach for explorative debugging of stream processing pipelines that allows in-depth investigation of a pipeline’s execution behavior and evolution. |
Timo Räth; Marius Schlegel; Kai-Uwe Sattler; |
| 119 | LDPRecover: Recovering Frequencies from Poisoning Attacks Against Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose LDPRecover, a method that can recover accurate aggregated frequencies from poisoning attacks, even if the server does not learn the details of the attacks. |
Xinyue Sun; Qingqing Ye; Haibo Hu; Jiawei Duan; Tianyu Wo; Jie Xu; Renyu Yang; |
| 120 | Differentially Private Graph Neural Networks for Link Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, in this work we propose a differentially private link prediction (DPLP) framework, building upon subgraph-based GNNs. |
Xun Ran; Qingqing Ye; Haibo Hu; Xin Huang; Jianliang Xu; Jie Fu; |
| 121 | Secure and Practical Functional Dependency Discovery in Outsourced Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on a typical maintenance task – functional dependency (FD) discovery. |
Xinle Cao; Yuhan Li; Dmytro Bogatov; Jian Liu; Kui Ren; |
| 122 | SecMdp: Towards Privacy-Preserving Multimodal Deep Learning in End-Edge-Cloud Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SecMdp, an SGX-assisted secure computational framework for multimodal data in the EEC architecture. |
Zhao Bai; Mingyue Wang; Fangda Guo; Yu Guo; Chengjun Cai; Rongfang Bie; Xiaohua Jia; |
| 123 | CARGO: Crypto-Assisted Differentially Private Triangle Counting Without Trusted Servers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our paper introduces a crypto-assisted differentially private triangle counting system, named CARGO, leveraging cryptographic building blocks to improve the effectiveness of differentially private triangle counting without assumption of trusted servers. |
Shang Liu; Yang Cao; Takao Murakami; Jinfei Liu; Masatoshi Yoshikawa; |
| 124 | Real-Time Trajectory Synthesis with Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose RetraSyn, a novel real-time trajectory synthesis framework, which is able to perform on-the-fly trajectory synthesis based on the mobility patterns privately extracted from users’ trajectory streams. |
Yujia Hu; Yuntao Du; Zhikun Zhang; Ziquan Fang; Lu Chen; Kai Zheng; Yunjun Gao; |
| 125 | Privacy-Preserving Traffic Flow Release with Consistency Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose post-processing techniques, which exploit the data’s inherent relations for corrections over the global and local differentially private traffic flow data, respectively. |
Xiaoting Zhu; Libin Zheng; Chen Jason Zhang; Peng Cheng; Rui Meng; Lei Chen; Xuemin Lin; Jian Yin; |
| 126 | Unraveling Privacy Risks of Individual Fairness in Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we pioneer the exploration of the interaction between the privacy risks of edge leakage and the individual fairness of a GNN. |
He Zhang; Xingliang Yuan; Shirui Pan; |
| 127 | Sketches-Based Join Size Estimation Under Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To tackle the noise error caused by protecting sensitive join values with large domains, we introduce a novel algorithm called LDPJoinSketch for sketch-based join size estimation under LDP. |
Meifan Zhang; Xin Liu; Lihua Yin; |
| 128 | PrivShape: Extracting Shapes in Time Series Under User-Level Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose PrivShape, a trie-based mechanism under user-level LDP to protect all elements. |
Yulian Mao; Qingqing Ye; Haibo Hu; Qi Wang; Kai Huang; |
| 129 | SparDL: Distributed Deep Learning Training with Efficient Sparse Communication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, to further reduce the latency cost and improve the efficiency of SparDL, we propose the Spar-All-Gather algorithm. |
Minjun Zhao; Yichen Yin; Yuren Mao; Qing Liu; Lu Chen; Yunjun Gao; |
| 130 | Metasql: A Generate-Then-Rank Framework for Natural Language to SQL Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Metasql, a unified generate-then-rank framework that can be flexibly incorporated with existing NLIDBs to consistently improve their translation accuracy. |
Yuankai Fan; Zhenying He; Tonghui Ren; Can Huang; Yinan Jing; Kai Zhang; X. Sean Wang; |
| 131 | Feed: Towards Personalization-Effective Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel PFL solution, Feed, that employs an enhanced shared-private model architecture and equips with a hybrid federated training strategy. |
Pengpeng Qiao; Kangfei Zhao; Bei Bi; Zhiwei Zhang; Ye Yuan; Guoren Wang; |
| 132 | T-Rex (Tree-Rectangles): Reformulating Decision Tree Traversal As Hyperrectangle Enclosure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a system that trades many random I/Os for few sequential I/O by remapping a forest of trees into a single spatial index. |
Meghana Madhyastha; Tamas Budavari; Vladimir Braverman; Joshua Vogelstein; Randal Burns; |
| 133 | FeatAug: Automatic Feature Augmentation From One-to-Many Relationship Tables Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, it does not include predicates in these queries, which significantly limits its application in many real-world scenarios. To overcome this limitation, we propose FeatAug, a new feature augmentation framework that automatically extracts predicate-aware SQL queries from one-to-many relationship tables. |
Danrui Qi; Weiling Zheng; Jiannan Wang; |
| 134 | AutoMC: Automated Model Compression Based on Domain Knowledge and Progressive Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To give more users easy access to the model compression scheme that best meets their needs, in this paper, we propose AutoMC, an effective and efficient automatic tool for model compression. |
Chunnan Wang; Hongzhi Wang; Xiangyu Shi; |
| 135 | Task-Oriented GNNs Training on Large Knowledge Graphs for Accurate and Efficient Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes KG-TOSA, an approach to automate the TOSG extraction for task-oriented HGNN training on a large KG. |
Hussein Abdallah; Waleed Afandi; Panos Kalnis; Essam Mansour; |
| 136 | Clients Help Clients: Alternating Collaboration for Semi-Supervised Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing SSFL methods ignore two inherent characteristics of FL: limited communication resources and heterogeneous data distribution, which severely hinder convergence stability and efficiency. This paper proposes a novel SSFL mechanism, called FedAC, to address the above two challenges by alternating client-to-client (C2C) collaboration. |
Zhida Jiang; Yang Xu; Hongli Xu; Zhiyuan Wang; Chunming Qiao; |
| 137 | AutoFeat: Transitive Feature Discovery Over Join Paths Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel ranking-based feature discovery method called AutoFeat. |
Andra Ionescu; Kiril Vasilev; Florena Buse; Rihan Hai; Asterios Katsifodimos; |
| 138 | Triple-D: Denoising Distant Supervision for High-Quality Data Creation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Triple-D, a technique for high-quality data creation through adaptive pattern replacement and a scalable non-parametric model. |
Xinyi Zhu; Yongqi Zhang; Lei Chen; Kai Chen; |
| 139 | Efficient Partial Order Based Transaction Processing for Permissioned Blockchains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the consensus phase, we propose a consensus algorithm to conduct the maximal common subgraph, CPGraph, based on the PGraphs of different nodes. |
Shuai Zhao; Zhiwei Zhang; Junkai Wang; Ye Yuan; Meihui Zhang; Guoren Wang; Jiang Xiao; |
| 140 | TELL: Efficient Transaction Execution Protocol Towards Leaderless Consensus Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, we propose Dynamic Commitment Epoch (DCE) to adapt to instances’ running status and decrease blocks’ committing latency. |
Xing Tong; Zheming Ye; Zhao Zhang; Cheqing Jin; Aoying Zhou; |
| 141 | SpotLess: Concurrent Rotational Consensus Made Practical Through Rapid View Synchronization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present SpotLess, a novel concurrent rotational consensus protocol made practical. |
Dakai Kang; Sajjad Rahnama; Jelle Hellings; Mohammad Sadoghi; |
| 142 | PrestigeBFT: Revolutionizing View Changes in BFT Consensus Algorithms with Reputation Mechanisms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Passive view-change protocols are widely employed in BFT algorithms; however, they present the risks of selecting unavailable or slow servers as leaders. To tackle these challenges, we propose PrestigeBFT, a novel BFT consensus algorithm that incorporates an active view-change protocol with reputation mechanisms. |
Gengrui Zhang; Fei Pan; Sofia Tijanic; Hans-Arno Jacobsen; |
| 143 | Porygon: Scaling Blockchain Via 3D Parallelism Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Porygon, a novel stateless blockchain with three-dimensional (3D) parallelism. |
Wuhui Chen; Ding Xia; Zhongteng Cai; Hong-Ning Dai; Jianting Zhang; Zicong Hong; Junyuan Liang; Zibin Zheng; |
| 144 | Authenticated Keyword Search on Large-Scale Graphs in Hybrid-Storage Blockchains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Merkle Path DAG (MP-DAG), a novel ADS that aggregates the unqualified paths that will not appear in the result trees to efficiently handle authenticated keyword search queries on graphs. |
Siyu Li; Zhiwei Zhang; Jiang Xiao; Meihui Zhang; Ye Yuan; Guoren Wang; |
| 145 | MuFuzz: Sequence-Aware Mutation and Seed Mask Guidance for Blockchain Smart Contract Fuzzing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we shed light on smart contract fuzzing by employing a sequence-aware mutation and seed mask guidance strategy. |
Peng Qian; Hanjie Wu; Zeren Du; Turan Vural; Dazhong Rong; Zheng Cao; Lun Zhang; Yanbin Wang; Jianhai Chen; Qinming He; |
| 146 | Authenticated Subgraph Matching in Hybrid-Storage Blockchains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach to support authenticated subgraph matching queries for large graphs kept off-chain. |
Siyu Li; Zhiwei Zhang; Meihui Zhang; Ye Yuan; Guoren Wang; |
| 147 | V2FS: A Verifiable Virtual Filesystem for Multi-Chain Query Authentication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, multi-chain queries pose several challenges for the querying system, including compatibility with existing blockchains, supporting diverse query types, and ensuring the integrity of query results. To tackle these challenges, we propose a novel paradigm called verifiable virtual filesystem (V2FS). |
Haixin Wang; Cheng Xu; Xiaojie Chen; Ce Zhang; Haibo Hu; Shikun Tian; Ying Yan; Jianliang Xu; |
| 148 | Lion: Minimizing Distributed Transactions Through Adaptive Replica Provision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Lion, a novel transaction processing protocol that utilizes partition-based replication to reduce the occurrence of distributed transactions. |
Qiushi Zheng; Zhanhao Zhao; Wei Lu; Chang Yao; Yuxing Chen; Anqun Pan; Xiaoyong Du; |
| 149 | FC: Adaptive Atomic Commit Via Failure Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose FC, a novel and practical ACP that can adapt to changes in failure conditions. |
Hexiang Pan; Quang-Trung Ta; Meihui Zhang; Zhanhao Zhao; Yeow Meng Chee; Gang Chen; Beng Chin Ooi; |
| 150 | ZeroTune: Learned Zero-Shot Cost Models for Parallelism Tuning in Stream Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces ZeroTune, a novel cost model for parallel and distributed stream processing that can be used to effectively set initial parallelism degrees of streaming queries. |
Pratyush Agnihotri; Boris Koldehofe; Paul Stiegele; Roman Heinrich; Carsten Binnig; Manisha Luthra; |
| 151 | MergeSFL: Split Federated Learning with Feature Merging and Batch Size Regulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite resource limitations, SFL also faces two other critical challenges in EC systems, i.e., statistical heterogeneity and system heterogeneity. In order to address these challenges, we propose a novel SFL framework, termed MergeSFL, by incorporating feature merging and batch size regulation in SFL. |
Yunming Liao; Yang Xu; Hongli Xu; Lun Wang; Zhiwei Yao; Chunming Qiao; |
| 152 | SharDAG: Scaling DAG-Based Blockchains Via Adaptive Sharding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SharDAG, a new mechanism that leverages adaptive sharding for DAG-based blockchains to achieve high performance and strong consistency. |
Feng Cheng; Jiang Xiao; Cunyang Liu; Shijie Zhang; Yifan Zhou; Bo Li; Baochun Li; Hai Jin; |
| 153 | Boosting Write Performance of KV Stores: An NVM-Enabled Storage Collaboration Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents ZigZagDB, an NVM-enabled data management scheme for LSM-tree-based key-value stores. |
Yi Wang; Jiajian He; Kaoyi Sun; Yunhao Dong; Jiaxian Chen; Chenlin Ma; Amelie Chi Zhou; Rui Mao; |
| 154 | Log Replaying for Real-Time HTAP: An Adaptive Epoch-Based Two-Stage Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents AETS, an Adaptive Epoch-based Two-Stage log replay framework that implements epoch-based log replay and table group transaction commit. |
Jun-Peng Zhu; Zhiwei Ye; Peng Cai; Donghui Wang; Fengyan Zhang; Dunbo Cai; Ling Qian; |
| 155 | FSD-Inference: Fully Serverless Distributed Inference with Scalable Cloud Communication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce novel fully serverless communication schemes for ML inference workloads, leveraging both cloud-based publish-subscribe/queueing and object storage offerings. |
Joe Oakley; Hakan Ferhatosmanoglu; |
| 156 | Graph Computation with Adaptive Granularity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, users often struggle to write and optimize new parallel algorithms to fit different programming abstractions, which can be a daunting task. To address these challenges, this paper introduces Argan, a parallel graph system that offers efficient adaptive-grained executions and a user-friendly abstraction. |
Ruiqi Xu; Yue Wang; Xiaokui Xiao; |
| 157 | FedCross: Towards Accurate Federated Learning Via Multi-Model Cross-Aggregation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, since only one global model cannot always accommodate all the incompatible convergence directions of local models, existing FL approaches greatly suffer from inferior classification accuracy. To address this issue, we present an efficient FL framework named FedCross, which uses a novel multi-to-multi FL training scheme based on our proposed multi-model cross-aggregation approach. |
Ming Hu; Peiheng Zhou; Zhihao Yue; Zhiwei Ling; Yihao Huang; Anran Li; Yang Liu; Xiang Lian; Mingsong Chen; |
| 158 | Mitigating Subgroup Unfairness in Machine Learning Classifiers: A Data-Driven Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate ways to improve subgroup fairness where subgroups are defined by the intersection of protected attributes. |
Yin Lin; Samika Gupta; H. V. Jagadish; |
| 159 | Non-Invasive Fairness in Learning Through The Lens of Data Drift Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the performance of such a multi-model strategy can degrade severely under poor representation of some groups in the data. We thus propose a single-model, reweighing strategy, ConFair, to overcome this limitation. |
Ke Yang; Alexandra Meliou; |
| 160 | Preventing The Popular Item Embedding Based Attack in Federated Recommendations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Privacy concerns have led to the rise of federated recommender systems (FRS), which can create personalized models across distributed clients. |
Jun Zhang; Huan Li; Dazhong Rong; Yan Zhao; Ke Chen; Lidan Shou; |
| 161 | Explainable Disparity Compensation for Efficient Fair Ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose easily explainable data-driven compensatory measures for ranking functions. |
Abraham Gale; Amélie Marian; |
| 162 | Generating Explanations to Understand and Repair Embedding-Based Entity Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Entity alignment (EA) seeks identical entities in different knowledge graphs, which is a long-standing task in the database research. |
Xiaobin Tian; Zequn Sun; Wei Hu; |
| 163 | Enhancing The Rationale-Input Alignment for Self-explaining Rationalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we discover that rationalization is prone to a problem named rationale shift, which arises from the algorithmic bias of the cooperative game. |
Wei Liu; Haozhao Wang; Jun Wang; Zhiying Deng; Yuankai Zhang; Cheng Wang; Ruixuan Li; |
| 164 | Model Trip: Enhancing Privacy and Fairness in Model Fusion Across Multi-Federations for Trustworthy Global Healthcare Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These challenges primarily encompass privacy, population shift and data dependency, which may lead to severe consequences such as the leakage of sensitive information within models and training samples, unfair model performance and resource burdens. To tackle these issues, we propose FairFusion, a cross-federation model fusion approach that enhances privacy and fairness. |
Qian Chen; Yiqiang Chen; Bingjie Yan; Xinlong Jiang; Xiaojin Zhang; Yan Kang; Teng Zhang; Wuliang Huang; Chenlong Gao; Lixin Fan; Qiang Yang; |
| 165 | Why-Not Explainable Graph Recommender Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Extending the notion of explainable RS, in this paper we define Why-Not explanations for recommendations that were expected but not returned, and propose and implement a technique for computing Why-Not explanations in a post-hoc manner for a graph-based RS. |
Hervé-Madelein Attolou; Katerina Tzompanaki; Kostas Stefanidis; Dimitris Kotzinos; |
| 166 | GAGE: Genetic Algorithm-Based Graph Explainer for Malware Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The proposed research introduces a structured pipeline for reverse engineering-based analysis, offering promising results compared to state-of-the-art methods and providing high-level interpretability for malicious code blocks in subgraphs. |
Mohd Saqib; Benjamin C.M. Fung; Philippe Charland; Andrew Walenstein; |
| 167 | Accurate Explanation Model for Image Classifiers Using Class Association Embedding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we propose a generative explanation model that combines the advantages of global and local knowledge for explaining image classifiers. |
Ruitao Xie; Jingbang Chen; Limai Jiang; Rui Xiao; Yi Pan; Yunpeng Cai; |
| 168 | FairGen: Towards Fair Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to tailor graph generation to downstream mining tasks by leveraging label information and user-preferred parity constraints. |
Lecheng Zheng; Dawei Zhou; Hanghang Tong; Jiejun Xu; Yada Zhu; Jingrui He; |
| 169 | Fast, Robust and Interpretable Participant Contribution Estimation for Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce CTFL, a fair, robust, and interpretable framework designed to estimate clients’ contributions to federated learning, aiming to incentivize high-quality data providers to participate in the federation. |
Yong Wang; Kaiyu Li; Yuyu Luo; Guoliang Li; Yunyan Guo; Zhuo Wang; |
| 170 | Exploring Optimal Parameters for Expected Results on Radius-Bounded K-Core Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the query parameters $k$ and $r$ are hard to specify by the users without any background knowledge, which means the query results often do not meet the users’ requirements, i.e., some expected vertices are missed in the query results. To tackle this issue, we investigate the problem of exploring optimal refined parameters (EOP) for expected results on RB $k$-core queries, which aims to explore the optimal parameters that make the expected vertex $\omega$ and query vertex $q$ appear in the same RB-$k$-core. |
Chuanyu Zong; Zefang Dong; Xiaochun Yang; Bin Wang; Huaijie Zhu; Tao Qiu; Rui Zhu; |
| 171 | Explaining Entity Matching with Clusters of Words Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose CREW, an explanation system for Entity Matching models that combines the comprehensibility of the explanations and fidelity to the model. |
Riccardo Benassi; Francesco Guerra; Matteo Paganelli; Donato Tiano; |
| 172 | Fair Top-k Query on Alpha-Fairness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an efficient exact framework with a basic implementation and an improved implementation to find the fairest utility function with the minimum modification penalty. |
Hao Liu; Raymond Chi-Wing Wong; Zheng Zhang; Min Xie; Bo Tang; |
| 173 | Butterfly Counting Over Bipartite Graphs with Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To obtain unbiased butterfly counts, we propose a multiple-round interaction algorithm to allow the vertices to download the noisy graph and compute local motif counts. |
Yizhang He; Kai Wang; Wenjie Zhang; Xuemin Lin; Wei Ni; Ying Zhang; |
| 174 | Temporal Graph Generation Featuring Time-Bound Communities Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Synthetic graph datasets are crucial for the assessment of network analysis algorithms, providing a measure of their effectiveness and efficiency. However, most existing … |
Shuwen Zheng; Chaokun Wang; Cheng Wu; Yunkai Lou; Hao Feng; Xuran Yang; |
| 175 | Breaking The Entanglement of Homophily and Heterophily in Semi-supervised Node Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The neglect of directed edges results in sub-optimal graph representations, thereby hindering the capacity of GNNs. To address this issue, we introduce AMUD, which quantifies the relationship between node profiles and topology from a statistical perspective, offering valuable insights for Adaptively Modeling the natural directed graphs as the Undirected or Directed graph to maximize the benefits from subsequent graph learning. |
Henan Sun; Xunkai Li; Zhengyu Wu; Daohan Su; Rong-Hua Li; Guoren Wang; |
| 176 | Efficient Core Decomposition Over Large Heterogeneous Information Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on two kinds of sparse matrix products, we propose two kinds of algebraic core decomposition algorithms, which are suitable for general HINs and locally dense HINs, respectively. |
Yucan Guo; Chenhao Ma; Yixiang Fang; |
| 177 | Accelerating SpMV for Scale-Free Graphs with Optimized Bins Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach called Binn to enhance SpMV performance for scale-free graphs on modern multicore processors. |
YuAng Chen; Jeffery Xu Yu; |
| 178 | PlatoD2GL: An Efficient Dynamic Deep Graph Learning System for Graph Neural Network Training on Billion-Scale Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The state-of-the-art suffers from two issues: (1) expensive memory consumption due to the huge indexing overhead of numerous key-value pairs in traditional key-value topology storage and (2) inefficient dynamic updating due to the heavy updates on indexing structures for weighted neighbor sampling. In this paper, we propose a Dynamic Graph-based Learning System, PlatoD2GL, to address the above two issues. |
Xing Huang; Dandan Lin; Weiyi Huang; Shijie Sun; Jie Wen; Chuan Chen; |
| 179 | Quantum Algorithms for The Maximum K-Plex Problem Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The state-of-the-art has reduced the complexity from a trivial $O^{*}(2^{n})$ to $O^{*}(c_{k}^{n})$, with $c_{k} > 1.94$ for $k > 3$, where $n$ denotes the number of vertices. In this paper, we demonstrate that MKP can be solved in $O^{*}(1.42^{n})$ and propose the first two quantum algorithms, qTKP and qMKP, to achieve this complexity. |
Xiaofan Li; Gao Cong; Rui Zhou; |
| 180 | Fast Iterative Graph Computing with Updated Neighbor States Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a graph reordering method, GoGraph, which can construct a well-formed vertex processing order effectively reducing the number of iteration rounds and, consequently, accelerating iterative computation. |
Yijie Zhou; Shufeng Gong; Feng Yao; Hanzhang Chen; Song Yu; Pengxi Liu; Yanfeng Zhang; Ge Yu; Jeffrey Xu Yu; |
| 181 | Querying Numeric-Constrained Shortest Distances on Road Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, it often overlooks the benefits of setting an upper bound $r$ (within the interval $[l, r]$). To bridge this gap, we introduce the numeric-constrained shortest-distance query problem, which enforces interval constraints $[l, r]$ on the numeric attributes of edges on a path. |
Mingyu Yang; Wentao Li; Wei Wang; Dong Wen; Lu Qin; |
| 182 | Mining Quasi-Periodic Communities in Temporal Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To efficiently compute the quasi-periodic communities, we propose a novel two-stage framework. |
Yue Zeng; Hongchao Qin; Rong-Hua Li; Kai Wang; Guoren Wang; Xuemin Lin; |
| 183 | GraphRARE: Reinforcement Learning Enhanced Graph Neural Network with Relative Entropy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing methods have the homogeneity assumption and show poor performance on heterophilic graphs, where the linked nodes have dissimilar features and different class labels, and the semantically related nodes might be multi-hop away. To address this limitation, this paper presents GraphRARE, a general framework built upon node relative entropy and deep reinforcement learning, to strengthen the expressive capability of GNNs. |
Tianhao Peng; Wenjun Wu; Haitao Yuan; Zhifeng Bao; Zhao Pengru; Xin Yu; Xuetao Lin; Yu Liang; Yanjun Pu; |
| 184 | Querying Historical Cohesive Subgraphs Over Temporal Bipartite Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the first cohesive subgraph model $(\alpha,\ \beta,\ \mathcal{T})$-core on temporal bipartite graphs. |
Shunyang Li; Kai Wang; Xuemin Lin; Wenjie Zhang; Yizhang He; Long Yuan; |
| 185 | AdaFGL: A New Paradigm for Federated Node Classification with Topology Heterogeneity Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, Federated Graph Learning (FGL) has attracted significant attention as a distributed framework based on graph neural networks, primarily due to its capability to break … |
Xunkai Li; Zhengyu Wu; Wentao Zhang; Henan Sun; Rong-Hua Li; Guoren Wang; |
| 186 | Positive Communities on Signed Graphs That Are Not Echo Chambers: A Clique-Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we propose a signed graph community substructure named the $(\epsilon,\ \phi)$-Clique which is the best of both worlds, where each user is happy to be in their community (indicated by having a proportion of positive edges $\geq\epsilon$ for each node) while a level of disagreement still exists in the system (indicated by the community having a proportion of negative edges $\geq \phi$). |
Alexander Zhou; Yue Wang; Lei Chen; M. Tamer Özsu; |
| 187 | Maximal Biclique Enumeration: A Prefix Tree Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel and highly-efficient algorithm for maximal biclique enumeration in bipartite graphs using prefix trees. |
Jiujian Chen; Kai Wang; Rong-Hua Li; Hongchao Qin; Xuemin Lin; Guoren Wang; |
| 188 | Batch Hop-Constrained S-t Simple Path Query Processing in Large Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in practice, it is more often that multiple HC-s-t path queries are issued simultaneously and processed as a batch. Therefore, we study the problem of batch HC-s-t path query processing in this paper and aim to compute the results of all queries concurrently and efficiently as a batch. |
Long Yuan; Kongzhang Hao; Xuemin Lin; Wenjie Zhang; |
| 189 | On Searching Maximum Directed $(k, \ell)$-Plex Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the maximum directed $(k, \ell)$-plex search problem which finds a directed $(k, \ell)$-plex with the most vertices. |
Shuohao Gao; Kaiqiang Yu; Shengxin Liu; Cheng Long; Zelong Qiu; |
| 190 | Masked Graph Modeling with Multi-View Contrast Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such a local perspective disregards the graph’s global information and structure. To address these limitations, we propose a novel graph pre-training framework called Graph Contrastive Masked Autoencoder (GCMAE). |
Yanchen Luo; Sihang Li; Yongduo Sui; Junkang Wu; Jiancan Wu; Xiang Wang; |
| 191 | Multi-View Teacher with Curriculum Data Fusion for Robust Unsupervised Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Graph Neural Networks (GNNs) have emerged as an effective tool for graph classification, yet their reliance on extensive labeled data poses a significant challenge, especially when such labels are scarce. To address this challenge, this paper presents a novel framework, denoted as Multi-View Teacher with Curriculum Data Fusion (MTDF). |
Yuhao Tang; Junyu Luo; Ling Yang; Xiao Luo; Wentao Zhang; Bin Cui; |
| 192 | GShop: Towards Flexible Pricing for Graph Statistics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, current query-based pricing frameworks cannot be applied to price graph statistics, as they fail to consider buyers’ affordability and prevent arbitrage trading. To address this gap, in this paper, we propose a novel framework GShop for pricing graph statistic queries. |
Chen Chen; Ye Yuan; Zhenyu Wen; Yu-Ping Wang; Guoren Wang; |
| 193 | LearnSC: An Efficient and Unified Learning-Based Framework for Subgraph Counting Problem Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an efficient and unified deep learning-based solution framework LearnSC, which solves the subgraph counting problem approximately. |
Wenzhe Hou; Xiang Zhao; Bo Tang; |
| 194 | AFTER: Adaptive Friend Discovery for Temporal-Spatial and Social-Aware XR Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel scenario of socializing in social XR, which has the potential to substantially enhance traditional social media through i) the recommendation of appropriate surrounding users that cater to users’ individual preferences, ii) the adaptive avoidance of view occlusions to facilitate users in locating their friends, iii) the consideration of users’ social presence, and iv) the development of cross-platform solutions to provide hybrid participation. |
Bing-Jyue Chen; Chiok Yew Ho; De-Nian Yang; |
| 195 | Reducing Resource Usage for Continuous Model Updating and Predictive Query Answering in Graph Streams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We observe the need for continuous, online training of dynamic graph neural network (DGNN) models while at the same time using them to answer continuous predictive queries as data streams in. |
Qu Liu; Adam King; Tingjian Ge; |
| 196 | Graph Anomaly Detection with Domain-Agnostic Pre-Training and Few-Shot Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike the total neglect of valuable labeled anomalies in unsupervised approaches or the potential overfitting in supervised approaches, we propose a few-shot-oriented framework GUDI in this paper. |
Xujia Li; Lei Chen; |
| 197 | NC-ALG: Graph-Based Active Learning Under Noisy Crowd Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, due to this impractical assumption, existing works only focus on optimizing the node selection in AL but neglect optimizing the labeling process. Therefore, we present NC-ALG, the first GNN-based AL framework that optimizes both the node selection and node labeling process under a noisy crowd. |
Wentao Zhang; Yexin Wang; Zhenbang You; Yang Li; Gang Cao; Zhi Yang; Bin Cui; |
| 198 | Fast Multilayer Core Decomposition and Indexing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing ML core decomposition algorithms face performance issues due to unavoidably unnecessary computations and are inherently serial, unable to fully leverage the multi-core processors. In this paper, we reformulate the search space of this problem with a tree-shaped structure called MLC-tree. |
Dandan Liu; Run-An Wang; Zhaonian Zou; Xin Huang; |
| 199 | CINA: Curvature-Based Integrated Network Alignment with Hypergraph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the disparity and diversity based on distinct structural patterns of ubiquitous anchor links. |
Pengfei Jiao; Yuanqi Liu; Yinghui Wang; Ge Zhang; |
| 200 | Open-World Semi-Supervised Learning for Node Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an IMbalance-Aware method named OpenIMA for Open-world semi-supervised node classification, which trains the node classification model from scratch via contrastive learning with bias-reduced pseudo labels. |
Yanling Wang; Jing Zhang; Lingxi Zhang; Lixin Liu; Yuxiao Dong; Cuiping Li; Hong Chen; Hongzhi Yin; |
| 201 | Scalable Community Search with Accuracy Guarantee on Attributed Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We formally define our CS-AG problem atop a $q$-centric attribute cohesiveness metric considering both textual and numerical attributes, for the $k$-core model on homogeneous graphs. |
Yuxiang Wang; Shuzhan Ye; Xiaoliang Xu; Yuxia Geng; Zhenghe Zhao; Xiangyu Ke; Tianxing Wu; |
| 202 | From Motif to Path: Connectivity and Homophily Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As the traditional BFS or DFS approaches are not valid anymore, we develop a disjoint set algorithm instead. |
Qihao Wang; Hongtai Cao; Xiaodong Li; Kevin Chen-Chuan Chang; Reynold Cheng; |
| 203 | Self-Training GNN-based Community Search in Large Attributed Heterogeneous Information Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, semi-supervised learning methods demand substantial labeled data and incur considerable memory and time costs when applied to large AHINs. To tackle these challenges, we propose a MK (Most-likely; K-sized) community search approach. |
Yuan Li; Xiuxu Chen; Yuhai Zhao; Wen Shan; Zhengkui Wang; Guoli Yang; Guoren Wang; |
| 204 | HGAMLP: Heterogeneous Graph Attention MLP with De-Redundancy Mechanism Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, they bury the graph structure information of the higher-order meta-paths and fail to explore deeper graph structure information. In this paper, we address these two limitations and propose a new non-parametric HGNN framework called Heterogeneous Graph Attention Multi-Layer Perceptron (HGAMLP). |
Yuxuan Liang; Wentao Zhang; Zeang Sheng; Ling Yang; Jiawei Jiang; Yunhai Tong; Bin Cui; |
| 205 | FocusCore Decomposition of Multilayer Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel dense subgraph model called FocusCore (FoCore) for multilayer graphs, which can pay more attention to layers focused on by users. |
Run-An Wang; Dandan Liu; Zhaonian Zou; |
| 206 | Search to Fine-Tune Pre-Trained Graph Neural Networks for Graph-Level Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To further boost pre-trained GNNs, we propose to search to fine-tune pre-trained GNNs for graph-level tasks (S2PGNN), which can adaptively design a suitable fine-tuning framework for the given pre-trained GNN and downstream data. |
Zhili Wang; Shimin Di; Lei Chen; Xiaofang Zhou; |
| 207 | BOURNE: Bootstrapped Self-Supervised Learning Framework for Unified Graph Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, state-of-the-art GAD methods, such as CoLA and SL-GAD, heavily rely on negative pair sampling in contrastive learning, which incurs high computational costs, hindering their scalability to large graphs. To address these limitations, we propose a novel unified graph anomaly detection framework based on bootstrapped self-supervised learning (named BOURNE). |
Jie Liu; Mengting He; Xuequn Shang; Jieming Shi; Bin Cui; Hongzhi Yin; |
| 208 | Discovering Personalized Characteristic Communities in Attributed Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the novel problem of Characteristic cOmmunity Discovery (COD) in attributed graphs. |
Yudong Niu; Yuchen Li; Panagiotis Karras; Yanhao Wang; Zhao Li; |
| 209 | TP-GNN: Continuous Dynamic Graph Neural Network for Graph Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the limitations of current approaches, this paper proposes TP-GNN, a novel continuous dynamic graph neural network model intended for graph classification in dynamic networks, which offers two primary advantages: (1) TP-GNN captures the long temporal dependencies via a novel message-passing method based on the information flow among the nodes, and (2) it learns the network evolution process from edge order for accurate dynamic network analytics. |
Jie Liu; Jiamou Liu; Kaiqi Zhao; Yanni Tang; Wu Chen; |
| 210 | GraphHI: Boosting Graph Neural Networks for Large-Scale Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To integrate multiple sources of hidden insights, we propose ALC, an algorithm that dynamically sets appropriate combination coefficients for various loss terms. |
Hao Feng; Chaokun Wang; Ziyang Liu; Yunkai Lou; Zhenyu Liu; Xiaokun Zhu; Yongjun Bao; Weipeng Yan; |
| 211 | DiscoGNN: A Sample-Efficient Framework for Self-Supervised Graph Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, for graph-level learning, the vanilla contrastive framework cannot reflect the distinction between the in-batch negatives. To alleviate this issue, we propose RankGCL, which enables the contrastive framework to capture the similarity ranking information between graphs and shows special superiority in graph similarity-based practical tasks. |
Jun Xia; Shaorong Chen; Yue Liu; Zhangyang Gao; Jiangbin Zheng; Xihong Yang; Stan Z. Li; |
| 212 | Incorporating Dynamic Temperature Estimation Into Contrastive Learning on Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate how to generate high-quality contrastive node embeddings based on an in-depth analysis of graph contrastive losses. |
Ziyang Liu; Chaokun Wang; Liqun Yang; Yunkai Lou; Hao Feng; Cheng Wu; Kai Zheng; Yang Song; |
| 213 | Newton Sketches: Estimating Node Intimacy in Dynamic Graphs Using Newton’s Law of Cooling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Because Intimacy varies in every time unit, the main challenge lies in how to record and update the Intimacy efficiently. In this paper, we propose a novel technique named Newton-Observe to address this challenge. |
Qizhi Chen; Ke Wang; Aoran Li; Yuhan Wu; Tong Yang; Bin Cui; |
| 214 | Counting Butterflies in Fully Dynamic Bipartite Graph Streams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Abacus, a novel approximate algorithm that counts butterflies in the presence of both insertions and deletions by utilizing sampling. |
Serafeim Papadias; Zoi Kaoudi; Varun Pandey; Jorge-Arnulfo Quiané-Ruiz; Volker Markl; |
| 215 | BIM: Improving Graph Neural Networks with Balanced Influence Maximization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although this problem has been well researched from the view of imbalanced class samples, we further argue that graph neural networks (GNNs) expose a unique source of imbalance from the influenced nodes of different classes of labeled nodes, i.e., labeled nodes are imbalanced in terms of the number of nodes they influenced during the influence propagation in GNNs. To tackle this previously unexplored influence-imbalance issue, we connect social influence maximization with the imbalanced node classification problem and propose balanced influence maximization (BIM). |
Wentao Zhang; Xinyi Gao; Ling Yang; Meng Cao; Ping Huang; Jiulong Shan; Hongzhi Yin; Bin Cui; |
| 216 | SES: Bridging The Gap Between Explainability and Prediction of Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the aforementioned limitations, we propose a self-explained and self-supervised graph neural network (SES) to bridge the gap between explainability and prediction. |
Zhenhua Huang; Kunhao Li; Shaojie Wang; Zhaohong Jia; Wentao Zhu; Sharad Mehrotra; |
| 217 | Efficient Cross-layer Community Search in Large Multilayer Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing multilayer community models suffer from two major limitations: 1) failure to identify informative communities with the most layers when a multilayer graph is associated with a large number of layers; 2) failure to distinguish the degree of connections in internal layers and cross-layers. To tackle the above limitations, this paper proposes a novel multilayer subgraph model called $(k, d)$-core. |
Longxu Sun; Xin Huang; Zheng Wu; Jianliang Xu; |
| 218 | Large Subgraph Matching: A Comprehensive and Efficient Approach for Heterogeneous Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods often prove inefficient for such tasks. To address this gap, we propose CSCE, which generates efficient plans for various problem settings. |
Hongtai Cao; Qihao Wang; Xiaodong Li; Matin Najafi; Kevin Chen-Chuan Chang; Reynold Cheng; |
| 219 | Adaptive Hypergraph Network for Trust Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an Adaptive Hypergraph Network for Trust Prediction (AHNTP), a novel approach that improves trust prediction accuracy by using higher-order correlations. |
Rongwei Xu; Guanfeng Liu; Yan Wang; Xuyun Zhang; Kai Zheng; Xiaofang Zhou; |
| 220 | Bottom-up K-Vertex Connected Component Enumeration By Multiple Expansion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the reason and proposes that the local expansion should be reformulated as a Multiple vertex collaborative Expansion problem instead of the traditional Unitary Expansion (UE). |
Haoyu Liu; Yongcai Wang; Xiaojia Xu; Deying Li; |
| 221 | Wings: Efficient Online Multiple Graph Pattern Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes Wings — a distributed system for online multi-GPM. |
Guanxian Jiang; Yunjian Zhao; Yichao Li; Zhi Liu; Tatiana Jin; Wanying Zheng; Boyang Li; James Cheng; |
| 222 | SGCL: Semantic-aware Graph Contrastive Learning with Lipschitz Graph Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods cannot ensure that semantic-related nodes are preserved during graph augmentation, leading to performance degradation. To tackle this issue, we propose a novel approach called Semantic-aware Graph Contrastive Learning (SGCL), which can generate high-quality contrastive samples by only augmenting semantic-unrelated nodes so as to facilitate the performance of GCL on downstream tasks. |
Jinhao Cui; Heyan Chai; Xu Yang; Ye Ding; Binxing Fang; Qing Liao; |
| 223 | Accelerating Scalable Graph Neural Network Inference with Node-Adaptive Propagation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although existing Scalable GNNs leverage linear propagation to preprocess the features and accelerate the training and inference procedure, these methods still suffer from scalability issues when making inferences on unseen nodes, as the feature preprocessing requires the graph to be known and fixed. To further accelerate Scalable GNNs inference in this inductive setting, we propose an online propagation framework and two novel node-adaptive propagation methods that can customize the optimal propagation depth for each node based on its topological information and thereby avoid redundant feature propagation. |
Xinyi Gao; Wentao Zhang; Junliang Yu; Yingxia Shao; Quoc Viet Hung Nguyen; Bin Cui; Hongzhi Yin; |
| 224 | Graph Condensation for Inductive Node Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, the original large graph is still required in the inference stage to perform message passing to inductive nodes, resulting in substantial computational demands. To overcome this issue, we propose mapping-aware graph condensation (MCond), explicitly learning the one-to-many node mapping from original nodes to synthetic nodes to seamlessly integrate new nodes into the synthetic graph for inductive representation learning. |
Xinyi Gao; Tong Chen; Yilong Zang; Wentao Zhang; Quoc Viet Hung Nguyen; Kai Zheng; Hongzhi Yin; |
| 225 | Graphix: “One User’s JSON Is Another User’s Graph” Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Graphix and show how it enables property graph views of existing document data in AsterixDB, a Big Data management system boasting a partitioned-parallel query execution engine. |
Glenn Galvizo; Michael J. Carey; |
| 226 | CSM-TopK: Continuous Subgraph Matching with TopK Density Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new problem of CSM-TopK to compute $k$ matches of a given query graph with the highest densities over a dynamic weighted graph and prove it to be NP-hard. |
Chuchu Gao; Youhuan Li; Zhibang Yang; Xu Zhou; |
| 227 | Efficient Maximal Temporal Plex Enumeration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fill the gap, in this paper, we propose a novel model called $(k, l)$-plex, which is a vertex set that exists in no fewer than $l$ timestamps, at each of which the induced subgraph is a $k$-plex. |
Yanping Wu; Renjie Sun; Xiaoyang Wang; Ying Zhang; Lu Qin; Wenjie Zhang; Xuemin Lin; |
| 228 | Denoising High-Order Graph Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Regarding the second problem, we propose a multi-order graphs fusion method, which adaptively integrates graphs of varying orders by solving a convex problem. |
Yonghao Chen; Ruibing Chen; Qiaoyun Li; Xiaozhao Fang; Jiaxing Li; Wai Keung Wong; |
| 229 | A Revisit to Graph Neighborhood Cardinality Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the problem of calculating the general neighborhood cardinality of each node $v$ in the graph, i.e., the sum of non-negative attribute values of the nodes in the $k$-hop neighborhood of a node $v$. |
Pinghui Wang; Yuanming Zhang; Kuankuan Cheng; Junzhou Zhao; |
| 230 | Attributed Network Embedding in Streaming Style Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, because storage limitations cause old attributes to be discarded as new ones are generated, existing methods struggle to integrate the new attribute information into embeddings generated from old attributes. Therefore, we propose a novel ANE framework named SANE (Streaming-style ANE), featuring a “memory” capability: when updating the embeddings for new attributes, old attribute information can be partly preserved. |
Anbiao Wu; Ye Yuan; Changsheng Li; Yuliang Ma; Hao Zhang; |
| 231 | Faster Depth-First Subgraph Matching on GPUs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: They also use hardcoded fixed space for stacks that is determined ad-hoc and may lead to inaccuracy when the allocated space is insufficient. In this paper, we use subgraph matching as a case study to propose novel depth-first GPU solutions to address the above problems. |
Lyuheng Yuan; Da Yan; Jiao Han; Akhlaque Ahmad; Yang Zhou; Zhe Jiang; |
| 232 | G2-AIMD: A Memory-Efficient Subgraph-Centric Framework for Efficient Subgraph Finding on GPUs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present $\mathbf{G}^{2}$-AIMD, a subgraph-centric framework for efficient subgraph search on GPUs, which enjoys the efficiency of BFS on the search-space tree while avoiding intermediate subgraph-size explosion with novel system designs such as adaptive chunk-size adjustment and host-memory subgraph buffering, inspired by the additive-increase/multiplicative-decrease (AIMD) algorithm in TCP congestion control. |
Lyuheng Yuan; Akhlaque Ahmad; Da Yan; Jiao Han; Saugat Adhikari; Xiaodong Yu; Yang Zhou; |
| 233 | Fine-Grained Anomaly Detection on Dynamic Graphs Via Attention Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel unsupervised anomaly detection method for dynamic graphs. |
Dong Chen; Xiang Zhao; Weidong Xiao; |
| 234 | Accelerating Biclique Counting on GPU Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce GBC (GPU-based Biclique Counting), a novel approach designed to enable efficient and scalable $(p, q)$-biclique counting on GPUs. |
Linshan Qiu; Zhonggen Li; Xiangyu Ke; Lu Chen; Yunjun Gao; |
| 235 | GPU-Accelerated Batch-Dynamic Subgraph Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Surprisingly, systematic exploration of subgraph matching in the context of batch-dynamic graphs, particularly on a GPU platform, remains untouched. In this paper, we bridge this gap by introducing an efficient framework, GAMMA (GPU-Accelerated Batch-Dynamic Subgraph Matching). |
Linshan Qiu; Lu Chen; Hailiang Jie; Xiangyu Ke; Yunjun Gao; Yang Liu; Zetao Zhang; |
| 236 | I/O Efficient Max-Truss Computation in Large Static and Dynamic Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the problem of finding the $k_{\max}$-truss in external memory settings. |
Jiaqi Jiang; Qi Zhang; Rong-Hua Li; Qiangqiang Dai; Guoren Wang; |
| 237 | Efficient Multi-Query Oriented Continuous Subgraph Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose MQ-Match, an efficient approach to MQCSM. |
Ziyi Ma; Jianye Yang; Xu Zhou; Guoqing Xiao; Jianhua Wang; Liang Yang; Kenli Li; Xuemin Lin; |
| 238 | Label Constrained Reachability Queries on Time Dependent Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a formal definition of time-dependent label-constrained reachability (TDLCR) queries based on LCR. |
Yishu Wang; Jinlong Chu; Ye Yuan; Yu Gu; Hangxu Ji; Hao Zhang; |
| 239 | Time-Constrained Continuous Subgraph Matching Using Temporal Information for Filtering and Backtracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study time-constrained continuous subgraph matching, which detects a pattern with a strict partial order on the edge set in real-time whenever a temporal data graph changes over time. |
Seunghwan Min; Jihoon Jang; Kunsoo Park; Dora Giammarresi; Giuseppe F. Italiano; Wook-Shin Han; |
| 240 | Adaptive Truss Maximization on Large Graphs: A Minimum Cut Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on a partial conversion strategy, we revisit the problem of truss maximization in this paper and propose adaptive solutions that achieve more new k-truss edges. |
Zitan Sun; Xin Huang; Chengzhi Piao; Cheng Long; Jianliang Xu; |
| 241 | SACH: Significant-Attributed Community Search in Heterogeneous Information Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the problem of high-importance community search in HINs. |
Yanghao Liu; Fangda Guo; Bingbing Xu; Peng Bao; Huawei Shen; Xueqi Cheng; |
| 242 | TimeSGN: Scalable and Effective Temporal Graph Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We theoretically demonstrate that the DT-MP paradigm can reduce GPU memory usage compared to existing T-GNNs. Building on this foundation, we propose TimeSGN, a scalable and effective temporal graph neural network, which can handle billion-scale dynamic graphs. |
Yuanyuan Xu; Wenjie Zhang; Ying Zhang; Maria Orlowska; Xuemin Lin; |
| 243 | Variable-Length Path Query Evaluation Based on Worst-Case Optimal Joins Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel solution for efficient evaluation of variable-length path queries, based on worst-case optimal joins. |
Mingdao Li; Peng Peng; Zheyuan Hu; Lei Zou; Zheng Qin; |
| 244 | NewSP: A New Search Process for Continuous Subgraph Matching Over Dynamic Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we address the problem of unnecessary computations in traditional continuous subgraph matching (CSM) frameworks due to premature expansions of the search space in dynamic graphs. |
Ziming Li; Youhuan Li; Xinhuan Chen; Lei Zou; Yang Li; Xiaofeng Yang; Hongbo Jiang; |
| 245 | Querying Cohesive Subgraph Regarding Span-Constrained Triangles on Temporal Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to address the $(k, \delta)$-truss query, we propose both index-free and index-based approaches. |
Chuhan Hu; Ming Zhong; Yuanyuan Zhu; Tieyun Qian; Ting Yu; Hongyang Chen; Mengchi Liu; Jeffrey X. Yu; |
| 246 | Generating Robust Counterfactual Witnesses for Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a new class of explanation structures, called robust counterfactual witnesses (RCWs), to provide robust, both counterfactual and factual explanations for graph neural networks. |
Dazhuo Qiu; Mengying Wang; Arijit Khan; Yinghui Wu; |
| 247 | Generative and Contrastive Paradigms Are Complementary for Graph Self-Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For graph self-supervised learning (GSSL), masked autoencoder (MAE) follows the generative paradigm and learns to reconstruct masked graph edges or node features while contrastive learning (CL) maximizes the similarity between augmented views of the same graph. Existing works utilize MAE and CL separately but we observe that the MAE and CL paradigms are complementary and propose the graph contrastive masked autoencoder (GCMAE) framework to unify them. |
Yuxiang Wang; Xiao Yan; Chuang Hu; Quanqing Xu; Chuanhui Yang; Fangcheng Fu; Wentao Zhang; Hao Wang; Bo Du; Jiawei Jiang; |
| 248 | FedMix: Boosting with Data Mixture for Vertical Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we provide a theoretical analysis to show that unaligned data actually contains valuable and rich features, and a thoughtful design that harnesses the potential of unaligned samples to significantly improve the performance of VFL models. |
Yihang Cheng; Lan Zhang; Junyang Wang; Xiaokai Chu; Dongbo Huang; Lan Xu; |
| 249 | DMRNet: Effective Network for Accurate Discharge Medication Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, this is less emphasized by current automatic medication recommendation methods. To tackle the above challenges, we propose a three-module Discharge Medication Recommendation Network, called DMRNet, for accurate discharge medication recommendations. |
Jiyun Shi; Yuqiao Wang; Chi Zhang; Zhaojing Luo; Chengliang Chai; Meihui Zhang; |
| 250 | BClean: A Bayesian Data Cleaning System Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose BClean, a Bayesian Cleaning system that features automatic Bayesian network construction and user interaction. |
Jianbin Qin; Sifan Huang; Yaoshu Wang; Jing Zhu; Yifan Zhang; Yukai Miao; Rui Mao; Makoto Onizuka; Chuan Xiao; |
| 251 | MultiEM: Efficient and Effective Unsupervised Multi-Table Entity Matching Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Unfortunately, effective and efficient unsupervised multi-table EM remains under-explored. To fill this gap, this paper formally studies the problem of unsupervised multi-table entity matching and proposes an effective and efficient solution, termed as MultiEM. |
Xiaocan Zeng; Pengfei Wang; Yuren Mao; Lu Chen; Xiaoze Liu; Yunjun Gao; |
| 252 | A Critical Re-evaluation of Record Linkage Benchmarks for Learning-Based Matching Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the quality of the benchmark datasets typically used in the experimental evaluations of learning-based matching algorithms has not been examined in the literature. To cover this gap, we propose four complementary approaches to assessing the difficulty and appropriateness of 13 commonly used datasets: two theoretical ones, which involve new measures of linearity and existing measures of complexity, and two practical ones – the difference between the best non-linear and linear matchers, as well as the difference between the best learning-based matcher and the perfect oracle. |
George Papadakis; Nishadi Kirielle; Peter Christen; Themis Palpanas; |
| 253 | Online Query-Based Data Pricing with Time-Discounting Valuations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the query feature-based data pricing problem with unknown time-discounting data valuation. |
Yicheng Fu; Xiaoye Miao; Huanhuan Peng; Chongning Na; Shuiguang Deng; Jianwei Yin; |
| 254 | Representation Learning for Entity Alignment in Knowledge Graph: A Design Space Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, real-world applications often involve noisy and missing data, which introduces complexities for EA tasks. To address this, we propose a new benchmark that explores the design space of EA frameworks, which consists of the embedding, relation, attribute and alignment modules. |
Peng Huang; Meihui Zhang; Ziyue Zhong; Chengliang Chai; Ju Fan; |
| 255 | Fairness-Aware Data Preparation for Entity Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the substantial body of research that focuses on improving the effectiveness of entity matching, enhancing its fairness has received scant attention. To fill this gap, this paper introduces a new problem of preparing fairness-aware datasets for entity matching. |
Nima Shahbazi; Jin Wang; Zhengjie Miao; Nikita Bhutani; |
| 256 | Mitigating Data Sparsity in Integrated Data Through Text Conceptualization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We approach the problem from a textual information extraction perspective and propose to conceptualize external documents using the concepts in the integrated schema. |
Md Ataur Rahman; Sergi Nadal; Oscar Romero; Dimitris Sacharidis; |
| 257 | Measuring Approximate Functional Dependencies: A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on this analysis, we give clear recommendations for the AFD measures to use in practice. |
Marcel Parciak; Sebastiaan Weytjens; Niel Hens; Frank Neven; Liesbet M. Peeters; Stijn Vansummeren; |
| 258 | Efficient Relaxed Functional Dependency Discovery with Minimal Set Cover Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the efficient discovery of RFDs, this paper proposes a novel mining method to supplement the current research gaps. |
Xiaoou Ding; Yida Liu; Hongzhi Wang; Chen Wang; Yichen Song; Donghua Yang; Jianmin Wang; |
| 259 | Gen-T: Table Reclamation in Data Lakes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Unlike query discovery problems like Query-by-Example or by-Target, Table Reclamation focuses on reclaiming the data in the Source Table as fully as possible using real tables that may be incomplete or inconsistent. To do this, we define a new measure of table similarity, called error-aware instance similarity, to measure how close a reclaimed table is to a Source Table, a measure grounded in instance similarity used in data exchange. |
Grace Fan; Roee Shraga; Renée J. Miller; |
| 260 | Discovering Denial Constraints in Dynamic Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes an efficient and flexible algorithm that covers the earlier limitations regarding performance and scope. |
Eduardo H. M. Pena; Fabio Porto; Felix Naumann; |
| 261 | Towards Semantic Consistency: Dirichlet Energy Driven Robust Multi-Modal Entity Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a generalizable theoretical principle by examining semantic consistency from the perspective of Dirichlet energy. |
Yuanyi Wang; Haifeng Sun; Jiabo Wang; Jingyu Wang; Wei Tang; Qi Qi; Shaoling Sun; Jianxin Liao; |
| 262 | Share: Stackelberg-Nash Based Data Markets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Stackelberg-Nash based Data Markets (Share) to first realize a demand-driven incentivized data market with absolute pricing. |
Yuran Bi; Jinfei Liu; Chen Zhao; Junyi Zhao; Kui Ren; Li Xiong; |
| 263 | Interactive Trimming Against Evasive Online Data Manipulation Attacks: A Game-Theoretic Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an interactive game-theoretical model to defend against online data manipulation attacks using the trimming strategy. |
Yue Fu; Qingqing Ye; Rong Du; Haibo Hu; |
| 264 | Label Noise Correction for Federated Learning: A Secure, Efficient and Reliable Realization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present zkCor, an efficient and reliable label noise correction scheme with zero-knowledge confidentiality. |
Haodi Wang; Tangyu Jiang; Yu Guo; Fangda Guo; Rongfang Bie; Xiaohua Jia; |
| 265 | Mitigating Data Scarcity in Supervised Machine Learning Through Reinforcement Learning Guided Data Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, it is important to note that because the provided training data may exhibit a different data distribution compared to the validation (or unseen testing) data, the generative model learned from these seen training data cannot guarantee the generation of high-quality data relative to this ML task. To address this challenge, we introduce an iterative approach that gradually calibrates the generative model by interacting with an environment that tells whether generated tuples are good or bad, by using a validation dataset that is not exposed to the generative model. |
Chengliang Chai; Kaisen Jin; Nan Tang; Ju Fan; Lianpeng Qiao; Yuping Wang; Yuyu Luo; Ye Yuan; Guoren Wang; |
| 266 | Dual-Teacher De-Biasing Distillation Framework for Multi-Domain Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing methods are dedicated to improving the overall performance of fake news detection, ignoring the fact that unbalanced data leads to disparate treatment for different domains, i.e., the domain bias problem. To solve this problem, we propose the Dual-Teacher De-biasing Distillation framework (DTDBD) to mitigate bias across different domains. |
Jiayang Li; Xuan Feng; Tianlong Gu; Liang Chang; |
| 267 | Are There Fundamental Limitations in Supporting Vector Data Management in Relational Databases? A Case Study of PostgreSQL Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to answer this question. We chose PostgreSQL as a representative relational database due to its popularity. |
Yunan Zhang; Shige Liu; Jianguo Wang; |
| 268 | Compression and In-Situ Query Processing for Fine-Grained Array Lineage Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces DSLog, a storage system that efficiently stores, indexes, and queries array data lineage, agnostic to capture methodology. |
JinJin Zhao; Sanjay Krishnan; |
| 269 | TSDDISCOVER: Discovering Data Dependency for Time Series Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In recognition of the obvious characteristics inherent in time series data, we introduce a novel data dependency, termed TSDD. |
Xiaoou Ding; Yingze Li; Hongzhi Wang; Chen Wang; Yida Liu; Jianmin Wang; |
| 270 | Time Series Data Cleaning Under Expressive Constraints on Both Rows and Columns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the challenges, we propose a novel data cleaning method for time series which incorporates expressive constraints that support arithmetic operations between attributes and time context. |
Xiaoou Ding; Genglong Li; Hongzhi Wang; Chen Wang; Yichen Song; |
| 271 | Cost-Effective In-Context Learning for Entity Resolution: A Design Space Exploration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the problem, in this paper, we provide a comprehensive study to investigate how to develop a cost-effective batch prompting approach to ER. |
Meihao Fan; Xiaoyue Han; Ju Fan; Chengliang Chai; Nan Tang; Guoliang Li; Xiaoyong Du; |
| 272 | A Multi-Task Learning Framework for Reading Comprehension of Scientific Tabular Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Previous studies have focused on scientific tables, but they are limited to individual modules or tasks and lack a comprehensive framework. To address these issues, we introduce a reading comprehension framework for scientific tables, named NRTR, which uses a multi-task learning approach that shares a common encoder, achieves reasoning across various tasks, including question answering, cloze testing, and fact verification. |
Xu Yang; Meihui Zhang; Ju Fan; Zeyu Luo; Yuxin Yang; |
| 273 | Enabling Efficient NVM-Based Text Analytics Without Decompression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose N-TADOC, which substitutes DRAM with NVM while maintaining TADOC’s analytics performance and space savings. |
Xiaokun Fang; Feng Zhang; Junxiang Nong; Mingxing Zhang; Puyun Hu; Yunpeng Chai; Xiaoyong Du; |
| 274 | F-TADOC: FPGA-Based Text Analytics Directly on Compression with HLS Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose FPGA-based text analytics directly on compression with HLS, namely F-TADOC, which is the first framework using HLS to provide FPGA-based text analytics directly on compressed data. |
Yanliang Zhou; Feng Zhang; Tuo Lin; Yuanjie Huang; Saiqin Long; Jidong Zhai; Xiaoyong Du; |
| 275 | Robust External Hash Aggregation in The Solid State Age Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We revisit external hash aggregation on modern hardware, aiming instead for robust performance that avoids a “performance cliff” when memory runs out. To achieve this, we introduce two techniques for handling temporary query intermediates. |
Laurens Kuiper; Peter Boncz; Hannes Mühleisen; |
| 276 | Neos: A NVMe-GPUs Direct Vector Service Buffer in User Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a vector buffer engine, Neos. |
Yuchen Huang; Xiaopeng Fan; Song Yan; Chuliang Weng; |
| 277 | TEngine: A Native Distributed Table Storage Engine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose TEngine, a native distributed table storage engine designed for NVMe SSD and RDMA. |
Xiaopeng Fan; Song Yan; Yuchen Huang; Chuliang Weng; |
| 278 | DmRPC: Disaggregated Memory-aware Datacenter RPC for Data-intensive Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present DmRPC, a DM-aware datacenter RPC for data-intensive datacenter applications to our knowledge. |
Jie Zhang; Xuzheng Chen; Yin Zhang; Zeke Wang; |
| 279 | RapidGKC: GPU-Accelerated K-Mer Counting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these solutions under-utilize the GPU parallelism because the encoding format of intermediate data forces sequential decoding. To address this problem, we design a new encoding scheme for variable-length genomic data to support parallel encoding and decoding. |
Yiran Cheng; Xibo Sun; Qiong Luo; |
| 280 | Sylvie: 3D-Adaptive and Universal System for Large-Scale Graph Neural Network Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, existing works fail to comprehensively consider diverse opportunities for acceleration. Motivated by these deficiencies, we propose Sylvie, a full-graph training system that not only improves the training throughput substantially but also maintains the model quality for universal GNNs. |
Meng Zhang; Qinghao Hu; Cheng Wan; Haozhao Wang; Peng Sun; Yonggang Wen; Tianwei Zhang; |
| 281 | UltraPrecise: A GPU-Based Framework for Arbitrary-Precision Arithmetic in Database Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a design and implementation of a framework called UltraPrecise which supports arbitrary-precision arithmetic for databases on GPU, aiming to gain high performance for arbitrary-precision arithmetic operations. |
Xin Li; Mengbai Xiao; Dongxiao Yu; Rubao Lee; Xiaodong Zhang; |
| 282 | Exploiting Persistent CPU Cache for Scalable Persistent Hash Index Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Spash, a highly scalable persistent hash index for PM systems with persistent CPU cache. |
Bowen Zhang; Shengan Zheng; Liangxu Nie; Zhenlin Qi; Linpeng Huang; Hong Mei; |
| 283 | LTPG: Large-Batch Transaction Processing on GPUs with Deterministic Concurrency Control Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes LTPG, a novel GPU-enabled database system that offers increased versatility and efficiency by eliminating the need for predefined read/write-sets. |
Jianpeng Wei; Yu Gu; Tianyi Li; Jianzhong Qi; Chuanwen Li; Yanfeng Zhang; Christian S. Jensen; Ge Yu; |
| 284 | Why Files If You Have A DBMS? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the former, we present a new BLOB allocation and logging design that exhibits lower write amplification, reduces WAL checkpointing frequency, and consumes less storage than the conventional strategies. |
Lam-Duy Nguyen; Viktor Leis; |
| 285 | STEM: Streaming-Based FPGA Acceleration for Large-Scale Compactions in LSM KV Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Compaction throughput significantly degrades with larger inputs, leading to frequent write stalls and decrement in overall write throughput. This paper proposes STEM, a stream-based compaction framework with FPGA to address this issue. |
Dongdong Tang; Weilan Wang; Yu Mao; Jinghuan Yu; Tei-Wei Kuo; Chun Jason Xue; |
| 286 | Improving The Relationship Between B+-Tree and Memory Allocator for Persistent Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prevalent solutions in both the allocators and B+-Trees are confronted with similar dilemmas between runtime performance and recovery efficiency. In this paper, we propose a novel crash-consistent protocol called ST-protocol (share and talk protocol) to address these dilemmas. |
Wei Yan; Xingjun Zhang; |
| 287 | Accelerating Aggregation Using A Real Processing-in-Memory System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we leverage the commercially available real UPMEM PIM system to accelerate the execution of the aggregation operator, which is data-intensive and involves large amounts of data movements. |
Muhammad Attahir Jibril; Hani Al-Sayeh; Kai-Uwe Sattler; |
| 288 | CLIMBER: Pivot-Based Approximate Similarity Search Over Big Data Series Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: State-of-the-art systems DPiSAX and TARDIS report accuracy below 10% and 40%, respectively, which is not practical for many real-world applications. In this paper, we investigate the root problems in these existing techniques that limit their ability to achieve a better trade-off between scalability and accuracy. |
Liang Zhang; Mohamed Y. Eltabakh; Elke A. Rundensteiner; Khalid Alnuaim; |
| 289 | Hill-Cache: Adaptive Integration of Recency and Frequency in Caching with Hill-Climbing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we theoretically investigated how parameters impact hit rates in LRFU and discovered two distinct features named unimodality and correlation. |
Yunfan Li; Huiqi Hu; Chaojing Lei; Xuan Zhou; Weining Qian; |
| 290 | Efficient Approximate Maximum Inner Product Search Over Sparse Vectors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The solutions to MIPS over sparse vectors rely heavily on inverted lists, resulting in poor query efficiency, particularly when dealing with large-scale sparse datasets. In this paper, we introduce SOSIA, a novel framework specifically tailored to address these limitations. |
Xi Zhao; Zhonghan Chen; Kai Huang; Ruiyuan Zhang; Bolong Zheng; Xiaofang Zhou; |
| 291 | Riveter: Adaptive Query Suspension and Resumption Framework for Cloud Native Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Addressing this challenge requires the design and implementation of query suspension and resumption with a mechanism that can adaptively determine when, if, and how to suspend queries. In this paper, we propose Riveter, a query suspension and resumption framework that can adaptively pause ongoing queries using various strategies, including (1) a redo strategy that terminates queries and subsequently re-runs them, (2) a pipeline-level strategy that suspends a query once one of its pipelines has completed to reduce the storage requirements for intermediate data, (3) and a process-level strategy that enables the suspension of query execution processes at any given moment but generates a substantial volume of intermediate data for query resumption. |
Rui Liu; Aaron J. Elmore; Michael J. Franklin; Sanjay Krishnan; |
| 292 | Mirage: Generating Enormous Databases for Complex Workloads Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we design a new generator, Mirage, that supports complex operators well, with low error bounds for cardinality constraints. |
Qingshuai Wang; Hao Li; Zirui Hu; Rong Zhang; Chengcheng Yang; Peng Cai; Xuan Zhou; Aoying Zhou; |
| 293 | Joint Directory, File and IO Trace Feature Extraction and Feature-based Trace Regeneration for Enterprise Storage Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing works primarily focus on I/O trace modeling and regeneration without considering the directory/file access information. In this paper, we propose a new technique, called Sketcher, that can sketch massive traces into highly compressed “joint features” with both directory/file and I/O characteristics, and then based on these features regenerate high-fidelity traces with a learning-based approach. |
Kecheng Huang; Xijun Li; Mingxuan Yuan; Ji Zhang; Zili Shao; |
| 294 | Robust Auto-Scaling with Probabilistic Workload Forecasting for Cloud Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the inherent inaccuracy of forecasting presents a significant challenge, potentially causing resource under-provisioning. To address this challenge, we propose robust predictive auto-scaling that considers the uncertainty in forecasts. |
Haitian Hang; Xiu Tang; Jianling Sun; Lingfeng Bao; David Lo; Haoye Wang; |
| 295 | CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we implement three checkpointing approaches that we surveyed and adapted for the distinct needs of streaming dataflows. |
George Siachamis; Kyriakos Psarakis; Marios Fragkoulis; Arie van Deursen; Paris Carbone; Asterios Katsifodimos; |
| 296 | BenchTemp: A General Benchmark for Evaluating Temporal Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To handle graphs in which features or connections are evolving over time, a series of temporal graph neural networks (TGNNs) have been proposed. |
Qiang Huang; Xin Wang; Susie Xi Rao; Zhichao Han; Zitao Zhang; Yongjun He; Quanqing Xu; Yang Zhao; Zhigao Zheng; Jiawei Jiang; |
| 297 | Fast Query Answering By Labeling Index on Uncertain Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these techniques either struggle with a significant efficiency-accuracy trade-off or lack generalization over different graphs and queries. To circumvent these limitations, this work proposes a novel index-based method for query answering on UGs. |
Zeyu Wang; Qihao Shi; Jiawei Chen; Can Wang; Mingli Song; Xinyu Wang; |
| 298 | Scavenger: Better Space-Time Trade-Offs for Key-Value Separated LSM-trees Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we systematically analyze the sources of space amplification of KV-separated LSM-trees and introduce Scavenger, which achieves a better trade-off between performance and space amplification. |
Jianshun Zhang; Fang Wang; Sheng Qiu; Yi Wang; Jiaxin Ou; Junxun Huang; Baoquan Li; Peng Fang; Dan Feng; |
| 299 | A Spatio-Temporal Series Data Model with Efficient Indexing and Layout for Cloud-Based Trajectory Data Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel cloud-based trajectory data management technique, called Springbok, to bridge the gap between massive trajectory data and cloud storage. |
Yang Guo; Zhiqi Wang; Jin Xue; Zili Shao; |
| 300 | Reverse Regret Query Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we directly utilize scores to evaluate products, enabling more accurate identification of prospective customers. |
Weicheng Wang; Raymond Chi-Wing Wong; H. V. Jagadish; Min Xie; |
| 301 | Resistance Eccentricity in Graphs: Distribution, Computation and Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we devise a near-linear time algorithm to approximate the resistance eccentricity for one or multiple given nodes, accompanied by a theoretically guaranteed error bound. |
Zenan Lu; Xiaotian Zhou; Ahad N. Zehmakan; Zhongzhi Zhang; |
| 302 | BushStore: Efficient B+Tree Group Indexing for LSM-Tree in Non-Volatile Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: And the previous NVM-enhanced LSM-Tree also ignores the sensitivity of NVM to small-grained random reads and writes, which we believe is the key to further improving read and write performance. To address these issues, we propose BushStore, an innovative LSM-Tree variant specifically optimized for NVM. |
Zhenghao Wang; Lidan Shou; Ke Chen; Xuan Zhou; |
| 303 | Cross Online Ride-Sharing for Multiple-Platform Cooperations in Spatial Crowdsourcing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The inter-platform collaborations on ride-sharing can ease the worker shortages and greatly improve the service quality, but have not been studied yet. In this paper, we propose a Cross Online Ride-sharing (CORS) problem, which allows a platform to borrow the available workers from other platforms to serve its own requests. |
Yurong Cheng; Zhaohe Liao; Xiaosong Huang; Yi Yang; Xiangmin Zhou; Ye Yuan; Guoren Wang; |
| 304 | Cooperative Air-Ground Instant Delivery By UAVs and Crowdsourced Taxis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unfortunately, the delivery detour of taxis and the limited battery of UAVs make it hard to meet the rapidly increasing instant delivery demands. Under this circumstance, this paper proposes an air-ground cooperative instant delivery paradigm to maximize the delivery performance and meanwhile minimize the negative effects on the taxi passengers. |
Junhui Gao; Qianru Wang; Xin Zhang; Juan Shi; Xiang Zhao; Qingye Han; Yan Pan; |
| 305 | Urban Sensing for Multi-Destination Workers Via Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing spatial crowdsourcing methods are only designed for workers who have single destinations, e.g., commuters, which are not applicable to recruit the multiple-destination people. Therefore, in this paper, we generalize the urban crowdsensing problem to the multi-destination scenario, namely, Urban Sensing for Multi-Destination Workers (USMDW). |
Shuliang Wang; Song Tang; Sijie Ruan; Cheng Long; Yuxuan Liang; Qi Li; Ziqiang Yuan; Jie Bao; Yu Zheng; |
| 306 | Semi-Asynchronous Online Federated Crowdsourcing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, canonical crowdsourcing systems mostly need to aggregate/transmit worker data and may lead to privacy-leakage. To tackle this problem, we propose a novel approach, called FedCS (Federated CrowdSourcing), to achieve privacy protection while ensuring quality. |
Xiangping Kang; Guoxian Yu; Qingzhong Li; Jun Wang; Hui Li; Carlotta Domeniconi; |
| 307 | FUDJ: Flexible User-Defined Distributed Joins Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FUDJ, Flexible User-defined Distributed Joins, a framework for complex distributed join algorithms. |
Akil Sevim; Ahmed Eldawy; E. Preston Carman; Michael J. Carey; Vassilis J. Tsotras; |
| 308 | IVE: Accelerating Enumeration-Based Subgraph Matching Via Exploring Isolated Vertices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel subgraph matching algorithm called the Isolated Vertices Exploration (IVE). |
Zite Jiang; Shuai Zhang; Xingzhong Hou; Mengting Yuan; Haihang You; |
| 309 | Approximate Skyline Index for Constrained Shortest Pathfinding with Theoretical Guarantee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a-FHL, a practical approximation method designed to circumvent the costly skyline path search and hasten computation on skyline path indexing. |
Ziyi Liu; Lei Li; Mengxuan Zhang; Wen Hua; Xiaofang Zhou; |
| 310 | CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search for GPUs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The balanced performance and recall of graph-based approaches have more recently garnered significant attention in ANNS algorithms; however, only a few studies have explored harnessing the power of GPUs and multi-core processors despite the widespread use of massively parallel and general-purpose computing. To bridge this gap, we introduce a novel parallel computing hardware-based proximity graph and search algorithm. |
Hiroyuki Ootomo; Akira Naruse; Corey Nolet; Ray Wang; Tamas Feher; Yong Wang; |
| 311 | VisionEmbedder: Bit-Level-Compact Key-Value Storage with Constant Lookup, Rapid Updates, and Rare Failure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these solutions have proven their worth in distributed storage, networking, and bioinformatics, they still face two significant issues: one is that their space cost could be further reduced; the other is that they are vulnerable to update failures, which can necessitate a complete table reconstruction. To address these issues, we introduce VisionEmbedder, a compact key-value embedder with constant-time lookup, fast dynamic updates, and a near-zero risk of reconstruction. |
Yuhan Wu; Feiyu Wang; Yifan Zhu; Zhuochen Fan; Zhiting Xiong; Tong Yang; Bin Cui; |
| 312 | Efficient Reverse $k$ Approximate Nearest Neighbor Search Over High-Dimensional Vectors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing R$k$NNS solutions face inefficiency when handling large-scale high-dimensional vectors due to their sensitivity to data dimensions and sizes during index construction or the verification of numerous candidate results in the query phase. Motivated by these challenges and the inherent intricacies of high-dimensional data processing, in this paper, we study an approximate version of the R$k$NNS problem (R$k$ANNS) for high-dimensional vectors, aiming to offer efficient and practical solutions. |
Yitong Song; Kai Wang; Bin Yao; Zhida Chen; Jiong Xie; Feifei Li; |
| 313 | HJG: An Effective Hierarchical Joint Graph for ANNS in Multi-Metric Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, for the first time, we study the approximate nearest neighbour search (ANNS) in multi-metric spaces, and propose HJG, a hierarchical joint graph, to solve the multi-metric query efficiently and effectively. |
Yifan Zhu; Lu Chen; Yunjun Gao; Ruiyao Ma; Baihua Zheng; Jingwen Zhao; |
| 314 | Dynamic Data Layout Optimization with Worst-Case Guarantees Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present an algorithmic framework OReO that makes online reorganization decisions to balance the benefits of improved query performance with the costs of reorganization. |
Kexin Rong; Paul Liu; Sarah Ashok Sonje; Moses Charikar; |
| 315 | QCFE: An Efficient Feature Engineering for Query Cost Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: (2) We propose a difference-propagation feature reduction method for query cost estimation to filter the ineffective features. |
Yu Yan; Hongzhi Wang; Junfang Huang; Dake Zhong; Tao Yu; Kaixin Zhang; Man Yang; Tianqing Wang; |
| 316 | Chameleon: Towards Update-Efficient Learned Indexing for Locally Skewed Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Frequent model retraining and reconstruction is required under this circumstance. To address this issue, we present Chameleon, an adaptive learned index for locally skewed data especially in the context of frequent updates. |
Na Guo; Yaqi Wang; Wenli Sun; Yu Gu; Jianzhong Qi; Zhenghao Liu; Xiufeng Xia; Ge Yu; |
| 317 | FOSS: A Self-Learned Doctor for Query Optimizer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these methods have achieved some success, they face challenges in either low training efficiency or limited plan search space. To address these challenges, we introduce FOSS, a novel framework for query optimization based on deep reinforcement learning. |
Kai Zhong; Luming Sun; Tao Ji; Cuiping Li; Hong Chen; |
| 318 | MFIX: An Efficient and Reliable Index Advisor Via Multi-Fidelity Bayesian Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a multi-fidelity index advisor, MFIX, designed to reconcile search efficiency and solution quality. |
Zhuo Chang; Xinyi Zhang; Yang Li; Xupeng Miao; Yanzhao Qin; Bin Cui; |
| 319 | VDTuner: Automated Performance Tuning for Vector Data Management Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce VDTuner, a learning-based automatic performance tuning framework for VDMS, leveraging multi-objective Bayesian optimization. |
Tiannuo Yang; Wen Hu; Wangqi Peng; Yusen Li; Jianguo Li; Gang Wang; Xiaoguang Liu; |
| 320 | TrendSharing: A Framework to Discover and Follow The Trends for Shared Mobility Services Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a framework, TrendSharing, to minimize the total tardiness when serving all tasks. |
Jiexi Zhan; Han Wu; Peng Cheng; Libin Zheng; Lei Chen; Chen Jason Zhang; Xuemin Lin; Wenjie Zhang; |
| 321 | Collectively Simplifying Trajectories in A Database: A Query Accuracy Driven Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we propose a multi-agent reinforcement learning based solution with two agents working cooperatively to collectively simplify trajectories in a database while optimizing query usability. |
Zheng Wang; Cheng Long; Gao Cong; Christian S. Jensen; |
| 322 | Efficient Learning-based Top-k Representative Similar Subtrajectory Query Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the challenge of high computational costs, we propose a learning-based framework, leveraging a deep learning model called Representative Similarity Score Estimation (RSSE) to approximate subtrajectory similarity scores efficiently and reduce the candidate set significantly. |
Kunming Wang; Shiyu Yang; Jiabao Jin; Peng Cheng; Jianye Yang; Xuemin Lin; |
| 323 | Urban Region Representation Learning with Attentive Fusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The typical fusion methods rely on simple aggregation, such as summation and concatenation, thereby disregarding correlations within the fused region embeddings. To address this limitation, we propose a novel model named HAFusion. |
Fengze Sun; Jianzhong Qi; Yanchuan Chang; Xiaoliang Fan; Shanika Karunasekera; Egemen Tanin; |
| 324 | LightTR: A Lightweight Framework for Federated Trajectory Recovery Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To bridge the gap between decentralized training and trajectory recovery, we propose a lightweight framework, LightTR, for federated trajectory recovery based on a client-server architecture, while keeping the data decentralized and private in each client/platform center (e.g., each data center of a company). |
Ziqiao Liu; Hao Miao; Yan Zhao; Chenxi Liu; Kai Zheng; Huan Li; |
| 325 | Learning Time-Aware Graph Structures for Spatially Correlated Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing methods rely on predefined or self-learning graphs, which are either static or unintentionally dynamic, and thus cannot model the time-varying correlations that exhibit trends and periodicities caused by the regularity of the underlying processes in CPS. To tackle this limitation, we propose Time-aware Graph Structure Learning (TagSL), which extracts time-aware correlations among time series by measuring the interaction of node and time representations in high-dimensional spaces. |
Minbo Ma; Jilin Hu; Christian S. Jensen; Fei Teng; Peng Han; Zhiqiang Xu; Tianrui Li; |
| 326 | Deep Dirichlet Process Mixture Model for Non-parametric Trajectory Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we proposed TrajDPM, an end-to-end framework for non-parametric trajectory clustering. |
Di Yao; Jin Wang; Wenjie Chen; Fangda Guo; Peng Han; Jingping Bi; |
| 327 | Parameterized Decision-Making with Multi-Modality Perception for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods either ignore the complexity of environments only fitting straight roads, or ignore the impact on surrounding vehicles during optimization phases, leading to weak environmental adaptability and incomplete optimization objectives. To address these limitations, we propose a pArameterized decision-making framework with mUlti-modality percepTiOn based on deep reinforcement learning, called AUTO. |
Yuyang Xia; Shuncheng Liu; Quanlin Yu; Liwei Deng; You Zhang; Han Su; Kai Zheng; |
| 328 | CausalTAD: Causal Implicit Generative Model for Debiased Online Trajectory Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we define the debiased trajectory anomaly detection problem and propose a causal implicit generative model, namely CausalTAD, to solve it. |
Wenbin Li; Di Yao; Chang Gong; Xiaokai Chu; Quanliang Jing; Xiaolei Zhou; Yuxuan Zhang; Yunxia Fan; Jingping Bi; |
| 329 | Learning to Hash for Trajectory Similarity Computation and Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a learning to hash method for trajectory similarity computation and search, called Traj2Hash, which consists of a two-channel trajectory encoder and a hash layer to encode trajectories into Euclidean and Hamming space, respectively. |
Liwei Deng; Yan Zhao; Jin Chen; Shuncheng Liu; Yuyang Xia; Kai Zheng; |
| 330 | Ocean: Online Clustering and Evolution Analysis for Dynamic Streaming Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on the problem of real-time clustering on streaming data in computation-intensive and high-dynamics tasks, through a framework Ocean, consisting of the Online clustering algorithm and evolution analysis. |
Chunhui Feng; Junhua Fang; Yue Xia; Pingfu Chao; Pengpeng Zhao; Jiajie Xu; Xiaofang Zhou; |
| 331 | SWISP: Distributed Convoy Mining Via Sliding Window-based Indexing and Sub-track Partitioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, on the basis of practical application scenarios, load balancing is an important consideration for distributed algorithms. To tackle the above challenges, we propose a novel method for distributed convoy mining via sliding window-based indexing and sub-track partitioning, abbreviated SWISP. |
Chenxu Wang; Xin Yang; Tianyi Li; Jiaxing Wei; Pinghui Wang; Hongzhen Xiang; Christian S. Jensen; |
| 332 | Querying Shortest Path on Large Time-Dependent Road Networks with Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the shortest path query over large-scale time-dependent road networks. |
Zengyang Gong; Yuxiang Zeng; Lei Chen; |
| 333 | FRESH: Towards Efficient Graph Queries in An Outsourced Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a generic framework called FRESH to handle various graph queries efficiently within a single outsourced graph. |
Kai Huang; Yunqi Li; Qingqing Ye; Yao Tian; Xi Zhao; Yue Cui; Haibo Hu; Xiaofang Zhou; |
| 334 | Managing The Future: Route Planning Influence Evaluation in Transportation Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, such a paradigm could generate congestion and deteriorate traffic conditions because the routing algorithms are not aware of their results’ influence on the traffic flow. Therefore, in this paper, we identify this flaw in the current paradigm and propose a route data management system to evaluate the influence of the routing results and help improve future downstream tasks. |
Zizhuo Xu; Lei Li; Mengxuan Zhang; Yehong Xu; Xiaofang Zhou; |
| 335 | Scalable Distance Labeling Maintenance and Construction for Dynamic Small-World Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we adopt the Core-Tree index, which has exceptional scalability while preserving high query efficiency, as the underlying shortest path index, and put forward efficient algorithms to maintain and construct it for large dynamic small-world networks. |
Xinjie Zhou; Mengxuan Zhang; Lei Li; Xiaofang Zhou; |
| 336 | Congestion-Mitigating Spatiotemporal Routing in Road Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose two solutions, Spatiotemporal Oblivious Routing (SOR) and Spatiotemporal Routing with History (SRH), which return routes based on the current and anticipated future traffic statuses, respectively, while offering theoretical guarantees. |
Libin Wang; Raymond Chi-Wing Wong; Christian S. Jensen; |
| 337 | A Just-In-Time Framework for Continuous Routing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we revisit the problem of the current routing system in terms of prediction scalability and routing result optimality. |
Jing Zhao; Lei Li; Mengxuan Zhang; Zihan Luo; Xi Zhao; Xiaofang Zhou; |
| 338 | QSRP: Efficient Reverse $k$-Ranks Query Processing on High-Dimensional Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the reverse $k$-ranks query, which finds the users that are the most interested in a product and has many applications including product promotion, targeted advertising, and market analysis. |
Zheng Bian; Xiao Yan; Jiahao Zhang; Man Lung Yiu; Bo Tang; |
| 339 | FedCTQ: A Federated-Based Framework for Accurate and Efficient Contact Tracing Query Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we define the Federated Contact Tracing Query (F-CTQ) problem and propose the FedCTQ framework based on hierarchical federation. |
Zhihao Zeng; Ziquan Fang; Lu Chen; Yunjun Gao; Kai Zheng; Gang Chen; |
| 340 | Alleviating The Inconsistency of Multimodal Data in Cross-Modal Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We further validate the prevalent existence of inconsistent data in multimodal datasets and highlight that it reduces the accuracy of existing Cross-Modal retrieval methods. In this paper, we propose a novel framework called Inconsistency Alleviated Cross-Modal Retrieval (IA-CMR), addressing challenges posed by these inconsistencies. |
Tieying Li; Xiaochun Yang; Yiping Ke; Bin Wang; Yinan Liu; Jiaxing Xu; |
| 341 | Firzen: Firing Strict Cold-Start Items with Frozen Heterogeneous and Homogeneous Graphs for Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a unified framework incorporating multi-modal content of items and KGs to effectively solve both strict cold-start and warm-start recommendation termed Firzen, which extracts the user-item collaborative information over frozen heterogeneous graph (collaborative knowledge graph), and exploits the item-item semantic structures and user-user behavioral association over frozen homogeneous graphs (item-item relation graph and user-user co-occurrence graph). |
Hulingxiao He; Xiangteng He; Yuxin Peng; Zifei Shan; Xin Su; |
| 342 | Reconsidering Tree Based Methods for K-Maximum Inner-Product Search: The LRUS-CoverTree Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In summary, our novel tree structure and new algorithm significantly improve upon existing tree based methods, and it is hoped that this contribution can lead to a reconsideration of tree based k-Maximum Inner-Product Search methods. |
Hengzhao Ma; Jianzhong Li; Yong Zhang; |
| 343 | Cross-Insight Trader: A Trading Approach Integrating Policies with Diverse Investment Horizons for Portfolio Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing RL-based solutions fail to consider the intrinsic causes behind this non-stationarity, which primarily stems from the involvement of diverse traders with distinct investment horizons and their varied investment strategies. In this paper, we tackle the non-stationarity problem by examining its intrinsic causes and propose cross-insight trader, a novel two-step RL-based approach that integrates multiple trading policies with different investment horizons to adapt to the changing market conditions. |
Zetao Zheng; Jie Shao; Shilong Deng; Anjie Zhu; Heng Tao Shen; Xiaofang Zhou; |
| 344 | Unsupervised Multimodal Graph Contrastive Semantic Anchor Space Dynamic Knowledge Distillation Network for Cross-Media Hash Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a dynamic knowledge distillation technique to transfer the multimodal semantic anchor space knowledge embedded in the multimodal large teacher model to the lightweight student model as much as possible. |
Yang Yu; Meiyu Liang; Mengran Yin; Kangkang Lu; Junping Du; Zhe Xue; |
| 345 | HIT: Solving Partial Index Tracking Via Hierarchical Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a hierarchical model for partial index tracking (HIT), which formulates PIT as a hierarchical Markov decision process (MDP) and is optimized via hierarchical reinforcement learning (HRL). |
Zetao Zheng; Jie Shao; Feiyu Chen; Anjie Zhu; Shilong Deng; Heng Tao Shen; |
| 346 | FieldSwap: Data Augmentation for Effective Form-Like Document Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, building extraction models in this domain often demands a large collection of high-quality training examples. To address this challenge, we introduce FieldSwap, a novel data augmentation technique specifically designed for such extraction problems. |
Jing Xie; James B. Wendt; Yichao Zhou; Seth Ebner; Sandeep Tata; |
| 347 | LT2R: Learning to Online Learning to Rank for Web Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, the existing OLTR solutions fail to learn from the cost-effective logged data, blocking their usage in the real industrial system. To handle the above issues, in this paper we introduce a new OLTR framework LT2R, namely Learning To online Learning to Rank. |
Xiaokai Chu; Changying Hao; Shuaiqiang Wang; Dawei Yin; Jiashu Zhao; Lixin Zou; Chenliang Li; |
| 348 | MUST: An Effective and Scalable Framework for Multimodal Search of Target Modality Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, both baselines have limitations in terms of efficiency and accuracy as they fail to adequately consider the varying importance of fusing information across modalities. To overcome these limitations, the paper proposes a novel framework, Multimodal Search of Target Modality, called MUST. |
Mengzhao Wang; Xiangyu Ke; Xiaoliang Xu; Lu Chen; Yunjun Gao; Pinpin Huang; Runkai Zhu; |
| 349 | Online Anomaly Detection Over Live Social Video Streaming Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a generic framework for effectively online detecting Anomalies Over social Video LIve Streaming (AOVLIS). |
Chengkun He; Xiangmin Zhou; Chen Wang; Iqbal Gondal; Jie Shao; Xun Yi; |
| 350 | Computing All Restricted Skyline Probabilities on Uncertain Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Considering that linear scoring functions are widely used in practical applications, we propose two efficient algorithms for the case where $\mathcal{F}$ is a set of linear scoring functions whose weights are described by linear constraints, one with near-optimal time complexity and the other with better expected time complexity. |
Xiangyu Gao; Jianzhong Li; Dongjing Miao; |
| 351 | M4: A Framework for Per-Flow Quantile Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel framework, M4, designed to estimate per-flow quantiles in data streams accurately. |
Siyuan Dong; Zhuochen Fan; Tianyu Bai; Tong Yang; Hanyu Xue; Peiqing Chen; Yuhan Wu; |
| 352 | DISCO: A Dynamically Configurable Sketch Framework in Skewed Data Streams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel sketch framework that can be dynamically configured to optimize the accuracy given a processed data stream. |
Jiaqian Liu; Ran Ben Basat; Louis De Wardt; Haipeng Dai; Guihai Chen; |
| 353 | BitMatcher: Bit-level Counter Adjustment for Sketches Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce BitMatcher, a fast global-adjusting algorithm that automatically adjusts the counter to the appropriate size to match the data stream. |
Qilong Shi; Chengjun Jia; Wenjun Li; Zaoxing Liu; Tong Yang; Jianan Ji; Gaogang Xie; Weizhe Zhang; Minlan Yu; |
| 354 | Space-Efficient Indexes for Uncertain Strings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In particular, we propose an index of $Q (n/ \log z)$ expected size, which can be constructed using $Q (n/ \log z)$ expected space, and supports very fast pattern matching queries in expectation, for patterns of length m ≥ ℓ. |
Esteban Gabory; Chang Liu; Grigorios Loukides; Solon P. Pissis; Wiktor Zuba; |
| 355 | GLO: Towards Generalized Learned Query Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we propose GLO to address the limitations and step towards generalized learned query optimization. |
Tianyi Chen; Jun Gao; Yaofeng Tu; Mo Xu; |
| 356 | A Fully On-Disk Updatable Learned Index Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: From our experiments, we observe that directly applying the existing in-memory learned indexes into on-disk setting suffers from several drawbacks and cannot outperform a standard B+-tree in most cases. |
Hai Lan; Zhifeng Bao; J. Shane Culpepper; Renata Borovica-Gajic; Yu Dong; |
| 357 | Routing-Guided Learned Product Quantization for Graph-Based Approximate Nearest Neighbor Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present an end-to-end Routing-guided learned Product Quantization (RPQ) for graph-based ANNS, which easily can be adaptive to existing popular PGs. |
Qiang Yue; Xiaoliang Xu; Yuxiang Wang; Yikun Tao; Xuliyuan Luo; |
| 358 | Guided SQL-Based Data Exploration with User Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We formulate the automation of personalized SQL-based data exploration as the problem of suggesting the most relevant query and accounting for user feedback at each step. |
Antonis Mandamadiotis; Georgia Koutrika; Sihem Amer-Yahia; |
| 359 | ShrinkHPO: Towards Explainable Parallel Hyperparameter Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose ShrinkHPO, an efficient and explainable-designed HPO approach with a major focus on (a) efficient hyperparameter configuration search strategy, (b) asynchronous executing intervention, and (c) XAI (eXplainable AI) design. |
Tianyu Mu; Hongzhi Wang; Haoyun Tang; Xinyue Shao; |
| 360 | LBSC: A Cost-Aware Caching Framework for Cloud Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a learning-based cost-aware caching framework called LBSC for cloud databases, ensuring faster query execution and robust performance in dynamic workloads. |
Zhaoxuan Ji; Zhongle Xie; Yuncheng Wu; Meihui Zhang; |
| 361 | DACE: A Database-Agnostic Cost Estimator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their poor robustness and inefficiency lead to their failure to meet the needs of practical scenarios. We propose a lightweight and Database-Agnostic Cost Estimation model (DACE) to address the above limitations. |
Zibo Liang; Xu Chen; Yuyang Xia; Runfan Ye; Haitian Chen; Jiandong Xie; Kai Zheng; |
| 362 | Enhancing LSM-Tree Key-Value Stores for Read-Modify-Writes Via Key-Delta Separation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a notion called key-delta (KD) separation to support efficient reads and RMWs in LSM-tree KV stores under RMW-intensive workloads. |
Jinhong Li; Yanjing Ren; Shujie Han; Patrick P. C. Lee; |
| 363 | TMan: A High-Performance Trajectory Data Management System Based on Key-Value Stores Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing systems are inadequate in providing fine-grained trajectory representations and efficient architecture for processing queries, leading to significant computational overhead. This paper introduces TMan to address these challenges. |
Huajun He; Zihang Xu; Ruiyuan Li; Jie Bao; Tianrui Li; Yu Zheng; |
| 364 | Kondo: Efficient Provenance-Driven Data Debloating Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the problem of determining and reducing unused data within a containerized application. |
Aniket Modi; Rohan Tikmany; Tanu Malik; Raghavan Komondoor; Ashish Gehani; Deepak D’Souza; |
| 365 | Preserving Topological Feature with Sign-of-Determinant Predicates in Lossy Compression: A Case Study of Vector Field Critical Points Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our contribution is three-fold. (1) We develop a generic theory to derive the allowable perturbation for one row of a matrix while preserving its sign of the determinant. |
Mingze Xia; Sheng Di; Franck Cappello; Pu Jiao; Kai Zhao; Jinyang Liu; Xuan Wu; Xin Liang; Hanqi Guo; |
| 366 | FreqyWM: Frequency Watermarking for The New Data Economy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel technique for modulating the appearance frequency of a few tokens within a dataset for encoding an invisible watermark that can be used to protect ownership rights upon data. |
Devriş İşler; Elisa Cabana; Alvaro Garcia-Recuero; Georgia Koutrika; Nikolaos Laoutaris; |
| 367 | Multi-Modality Is All You Need for Transferable Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we unleash the boundaries of the ID-based paradigm and propose a Pure Multi-Modality based Recommender system (PMMRec), which relies solely on the multi-modal contents of the items (e.g., texts and images) and learns transition patterns general enough to transfer across domains and platforms. |
Youhua Li; Hanwen Du; Yongxin Ni; Pengpeng Zhao; Qi Guo; Fajie Yuan; Xiaofang Zhou; |
| 368 | Multi-view Attentive Variational Learning for Group Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, following the paradigm of variational learning, this paper proposes a multi-view attentive variational preference aggregation network called GroupAV for group recommendation, so as to conduct user/group preference modeling and aggregation in a density-based manner. |
Wen Yang; Jiajie Xu; Rui Zhou; Lu Chen; Jianxin Li; Pengpeng Zhao; Chengfei Liu; |
| 369 | Corruption Robust Dynamic Pricing in Liner Shipping Under Capacity Constraint Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To maximize the cumulative revenue in the C3-MDP setting, we propose a programming framework, Bonus-Exploration based Episodic Programming (BEEP). |
Yongyi Hu; Xueyan Lit; Xikai Wei; Yangguang Shi; Xiaofeng Gao; Guihai Chen; |
| 370 | AdapTraj: A Multi-Source Domain Generalization Framework for Multi-Agent Trajectory Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building upon the new formulation, we propose AdapTraj, a multi-source domain generalization framework specifically tailored for multi-agent trajectory prediction. |
Tangwen Qian; Yile Chen; Gao Cong; Yongjun Xu; Fei Wang; |
| 371 | Towards Effective Next POI Prediction: Spatial and Semantic Augmentation with Remote Sensing Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, the uneven POI distribution further complicates the next POI prediction procedure. To address these challenges, we enrich input features and propose an effective deep-learning method within a two-step prediction framework. |
Nan Jiang; Haitao Yuan; Jianing Si; Minxiao Chen; Shangguang Wang; |
| 372 | KartGPS: Knowledge Base Update with Temporal Graph Pattern-based Semantic Rules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present KartGPS, a system for KB updating taking advantage of temporal graph pattern-based semantic (tGPS) rules. |
Hao Xin; Lei Chen; |
| 373 | Optimizing Probabilistic Box Embeddings with Distance Measures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The distance measures can naturally measure the “degree of disjointedness” for disjoint boxes and provide reasonable gradients for optimization. For the second problem, we theoretically prove that under certain conditions, the gradient would vanish exponentially, and therefore make the optimization converge to suboptimal solutions. |
Lang Mei; Jiaxin Mao; Ji-Rong Wen; |
| 374 | A Multi-View Clustering Algorithm for Short Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, the topic model based short text clustering algorithms represent short texts as bag-of-words, while the deep clustering models represent short texts as document embeddings. To address these issues, we propose a Multi-View Clustering (MVC) model that considers both views of the text. |
Minkuan Lu; Jianhua Yin; Kaijun Wang; Liqiang Nie; |
| 375 | GaussDB-Global: A Geographically Distributed Database System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These systems are sensitive to severe latency penalties caused by centralized transaction management, remote access to sharded data, and log shipping over long distances. To tackle these issues, we present GaussDB-Global, a sharded geographically distributed database system with asynchronous replication, for OLTP applications. |
Puya Memarzia; Huaxin Zhang; Kelvin Ho; Ronen Grosman; Jiang Wang; |
| 376 | Towards A Shared-Storage-Based Serverless Database Achieving Seamless Scale-Up and Read Scale-Out Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: (2) they lack the ability to scale out secondary nodes due to the absence of strong consistency support in secondary nodes. Based on our experience in building serverless databases, this paper proposes two fundamental requirements to address these two issues: seamless and instant migration and read scale-out. |
Yingqiang Zhang; Xinjun Yang; Hao Chen; Feifei Li; Jiawei Xu; Jie Zhou; Xudong Wu; Qiang Zhang; |
| 377 | Optimized Locking in SQL Azure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a technique called transaction-id locking that drastically reduces the number of in-memory locks and eliminates lock escalation. |
Chaitanya Sreenivas Ravella; Prashanth Purnananda; Hanuma Kodavalla; Peter Byrne; Adrian-Leonard Radu; Wayne Chen; Srikanth Sampath; Naga Bhavana Atluri; Srinag Rao; Priyanka Kakade; |
| 378 | Separation Is for Better Reunion: Data Lake Storage at Huawei Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we introduce a stream (storage) object as a storage abstraction for message streaming data to achieve the storage-disaggregated architecture with high scalability and reliability. |
Xin Tang; Chengliang Chai; Dawei Zhao; Haohai Ma; Yong Zheng; Zhenyong Fan; Xin Wu; Jiaquan Zhang; Rui Zhang; Duanshun Li; Yi He; Keji Huang; Guangbin Meng; Yidong Wang; Yuefeng Zhou; Tao Tao; Lirong Jian; Jiwu Shu; Yuping Wang; Ye Yuan; Guoren Wang; Guoliang Li; |
| 379 | Deep Learning with Spatiotemporal Data: A Deep Dive Into GeotorchAI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Considering the limitations of existing deep learning frameworks, we present GeoTorchAI, a framework for deep learning and scalable data processing on raster imagery and spatiotemporal non-imagery datasets. |
Kanchan Chowdhury; Mohamed Sarwat; |
| 380 | DATALORE: Can A Large Language Model Find All Lost Scrolls in A Data Repository? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, data transformations are often not well-documented or completely missing, resulting in poor traceability, reproducibility and explainability of ML pipelines. In this paper, we propose DATALORE, a framework that explains data changes between an initial dataset and its augmented version to improve traceability. |
Yuze Lou; Chuan Lei; Xiao Qin; Zichen Wang; Christos Faloutsos; Rishita Anubhai; Huzefa Rangwala; |
| 381 | Etude – Evaluating The Inference Latency of Session-Based Recommendation Models at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a result, data scientists must typically prototype and evaluate different deployment options in collaboration with devops teams – a tedious and costly process, which does not scale to multiple use cases. To alleviate this, we present Etude, an end-to-end benchmarking framework, which enables data scientists to automatically evaluate the inference performance of SBR models under different deployment options. |
Barrie Kersbergen; Olivier Sprangers; Frank Kootte; Shubha Guha; Maarten de Rijke; Sebastian Schelter; |
| 382 | CoachLM: Automatic Instruction Revisions Improve The Data Quality in LLM Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, instead of discarding low-quality samples, we propose CoachLM, a novel approach to enhance the quality of instruction datasets through automatic revisions on samples in the dataset. |
Yilun Liu; Shimin Tao; Xiaofeng Zhao; Ming Zhu; Wenbing Ma; Junhao Zhu; Chang Su; Yutai Hou; Miao Zhang; Min Zhang; Hongxia Ma; Li Zhang; Hao Yang; Yanfei Jiang; |
| 383 | GaussML: An End-to-End In-Database Machine Learning System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, UDFs may introduce security risks with vulnerable code, and suffer from performance problems, as constrained by data access and execution patterns of SQL query operators. To address these limitations, we propose a new in-database machine learning system, namely GaussML, which provides an end-to-end machine-learning ability with native SQL interface. |
Guoliang Li; Ji Sun; Lijie Xu; Shifu Li; Jiang Wang; Wen Nie; |
| 384 | Xorbits: Automating Operator Tiling for Distributed Data Science Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing systems often struggle with processing large datasets due to Out-of-Memory (OOM) problems caused by poor data partitioning. To overcome these challenges, we develop Xorbits, a high-performance, scalable data science framework specifically designed to distribute data science workloads across clusters while retaining familiar APIs. |
Weizheng Lu; Kaisheng He; Xuye Qin; Chengjie Li; Zhong Wang; Tao Yuan; Xia Liao; Feng Zhang; Yueguo Chen; Xiaoyong Du; |
| 385 | Couler: Unified Machine Learning Workflow Optimization in Cloud Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we design and implement Couler, a system designed for unified ML workflow optimization in the cloud. |
Xiaoda Wang; Yuan Tang; Tengda Guo; Bo Sang; Jiewei Wu; Jian Sha; Ke Zhang; Jiang Qian; Mingjie Tang; |
| 386 | AntDT: A Self-Adaptive Distributed Training Framework for Leader and Straggler Nodes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, it is challenging to use a systematic framework to address all stragglers because different stragglers require diverse data allocation and fault-tolerance mechanisms. Therefore, this paper proposes a unified distributed training framework called AntDT (Ant Distributed Training Framework) to adaptively solve the straggler problems. |
Youshao Xiao; Lin Ju; Zhenglei Zhou; Siyuan Li; Zhaoxin Huan; Dalong Zhang; Rujie Jiang; Lin Wang; Xiaolu Zhang; Lei Liang; Jun Zhou; |
| 387 | Addressing The Nested Data Processing Gap: JSONiq Queries on Snowflake Through Snowpark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the shortcomings of the latter approach by translating a language specifically designed for nested data, JSONiq, to a highly efficient, scalable, and feature rich RDBMS, the Snowflake Database. |
Dan Graur; Remo Röthlisberger; Adrian Jenny; Ghislain Fourny; Filip Drozdowski; Choden Konigsmark; Ingo Müller; Gustavo Alonso; |
| 388 | Bwe-tree: An Evolution of Bw-tree on Fast Storage Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Microsoft Research proposed Bw-tree, a variant of B+ tree layered on top of log structured storage. |
Rui Wang; Xinjun Yang; Feifei Li; David B. Lomet; Xin Liu; Panfeng Zhou; Yongxiang Chen; David Zhang; Jingren Zhou; Jiesheng Wu; |
| 389 | Resource Allocation with Service Affinity in Large-Scale Cloud Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In cloud resource scheduling, collocating service containers that frequently communicate to the same machine – termed “service affinity” – is instrumental in enhancing application performance. In response to this concern, we present a solution that harnesses service affinity and collocates containers to enhance the overall system performance and stability. |
Zuzhi Chen; Fuxin Jiang; Binbin Chen; Yu Li; Yunkai Zhang; Chao Huang; Rui Yang; Fan Jiang; Jianjun Chen; Wu Xiang; Guozhu Cheng; Rui Shi; Ning Ma; Wei Zhang; Tieying Zhang; |
| 390 | Online Index Recommendation for Slow Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, the DAS platform has accumulated lots of index creation samples. In this paper, we introduce index learner (IdxL), designed to learn index creation knowledge from these informative index data. |
Gan Peng; Peng Cai; Kaikai Ye; Kai Li; Jinlong Cai; Yufeng Shen; Han Su; Weiyuan Xu; |
| 391 | On Tuning Raft for IoT Workload in Apache IoTDB Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose to explore the opportunities of tuning Raft for the particular IoT workload, including alternative data structures, various compression algorithms, memory recycling strategies, etc. |
Tian Jiang; Xiangdong Huang; Shaoxu Song; Chen Wang; Jianmin Wang; |
| 392 | Enabling Roll-Up and Drill-Down Operations in News Exploration with Knowledge Graphs for Due Diligence and Risk Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce NCEXPLORER, a framework designed with OLAP-like operations to enhance the news exploration experience. |
Sha Wang; Yuchen Li; Hanhua Xiao; Zhifeng Bao; Lambert Deng; Yanfei Dong; |
| 393 | Multifaceted Reformulations for Null & Low Queries and Its Parallelism with Counterfactuals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance the user experience for N&L queries, we propose a novel method that leverages the capabilities of a neural translation model to provide diverse and multiple reformulations. |
Jayanth Yetukuri; Yuyan Wang; Ishita Khan; Liyang Hao; Zhe Wu; Yang Liu; |
| 394 | An Effective, Efficient, and Stable Framework for Query Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Yahoo! Trending Now lists the most trending ten user queries from Yahoo! Search. To discover top trending queries, query clustering is a critical intermediate phase that … |
Chang Lu; Liuqing Li; Donghyun Kim; Xinyue Wang; Rao Shen; |
| 395 | A Framework for Continuous KNN Ranking of EV Chargers with Estimated Components Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an innovative framework whose objective is to allow drivers to recharge their Electric Vehicles (EVs) from the most environmentally friendly chargers using an intelligent hoarding approach. |
Soteris Constantinou; Constantinos Costa; Andreas Konstantinidis; Mohamed F. Mokbel; Demetrios Zeinalipour-Yazti; |
| 396 | Large Language Models: Principles and Practice Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The tutorial covers the fundamental principles enabling language models, including the Transformer architecture, pre-training, and alignment. |
Immanuel Trummer; |
| 397 | Bipartite Graph Analytics: Current Techniques and Future Trends Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We start by outlining the importance of bipartite graph analytics, and the unique challenges that need to be addressed. |
Hanchen Wang; Kai Wang; Wenjie Zhang; Ying Zhang; |
| 398 | Privacy-Aware Analysis Based on Data Series Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Data series often contain sensitive information, though, about the individuals that act as service consumers or service providers. |
Stephan Fahrenkrog-Petersen; Han van der Aa; Matthias Weidlich; |
| 399 | Robust Query Optimization in The Era of Machine Learning: State-of-the-Art and Future Directions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this tutorial, we explore the notion of robustness in the context of query optimization, as well as how it is evaluated or even further supported. |
Amin Kamali; Verena Kantere; Calisto Zuzarte; |
| 400 | Quantum Data Management: From Theory to Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We aim to shed light on the uncharted territory of future data systems tailored for the quantum internet. |
Rihan Hai; Shih-Han Hung; Sebastian Feld; |
| 401 | An Interactive Dive Into Time-Series Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this tutorial, we take a holistic view of anomaly detection in time series, starting from the core definitions and taxonomies related to time series and anomaly types, to an extensive description of the anomaly detection methods proposed by different communities in the literature. |
Paul Boniol; John Paparrizos; Themis Palpanas; |
| 402 | A Comprehensive Tutorial on Over 100 Years of Diagrammatic Representations of Logical Statements and Relational Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We survey the history and state-of-the-art of relationally-complete diagrammatic representations of relational queries, discuss the key visual metaphors developed in over a century of investigations into diagrammatic languages, and organize the landscape by mapping the visual alphabets of diagrammatic representation systems to the syntax and semantics of Relational Algebra (RA) and Relational Calculus (RC). |
Wolfgang Gatterbauer; |
| 403 | Entity/Relationship Profiling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Entity/Relationship (E/R) Profiling as the discovery-oriented, data-driven counterpart of E/R Modeling. |
Henning Koehler; Sebastian Link; |
| 404 | GA-Tag: Data Enrichment with An Automatic Tagging System Utilizing Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce a tagging method that simplifies the handling of extensive data and facilitates the rapid search and extraction of relevant information. |
Genki Kusano; |
| 405 | Comparing Personalized Relevance Algorithms for Directed Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present an interactive Web platform that, given a directed graph, allows identifying the most relevant nodes related to a given query node. |
Luca Cavalcanti; Cristian Consonni; Martin Brugnara; David Laniado; Alberto Montresor; |
| 406 | FSM-Explorer: An Interactive Tool for Frequent Subgraph Pattern Mining From A Big Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this demonstration paper, we describe FSM-Explorer, an interactive tool that makes it easier for end-users to mine frequent subgraph patterns from a big graph $G$, and to explore the subgraph instances in $G$ that match the patterns. |
Jalal Khalil; Da Yan; Lyuheng Yuan; Jiao Han; Saugat Adhikari; Cheng Long; Yang Zhou; |
| 407 | TASKS: A Real-Time Query System for Instant Error-Tolerant Spatial Keyword Queries on Road Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, entering complete queries (e.g., the query keywords) can be cumbersome and prone to errors. To overcome these limitations, we present a real-time query system called TASKS for instant error-tolerant spatial keyword queries on road networks. |
Chengyang Luo; Lu Jin; Qing Liu; Yunjun Gao; Lu Chen; |
| 408 | VASIM: Vertical Autoscaling Simulator Toolkit Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces VASIM, an autoscaling simulator toolkit designed for testing recommendation algorithms, with a particular focus on CPU usage in VMs and Kubernetes pods. |
Anna Pavlenko; Karla Saur; Yiwen Zhu; Brian Kroth; Joyce Cahoon; Jesús Camacho-Rodríguez; |
| 409 | Demonstration of FeVisQA: Free-Form Question Answering Over Data Visualization Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Question Answering (QA) systems play a vital role in knowledge acquisition. CodeQA refers to question answering (QA) over source code for code comprehension purpose. However, … |
Yuanfeng Song; Jinwei Lu; Xuefang Zhao; Raymond Chi-Wing Wong; Haodi Zhang; |
| 410 | CleanEr: Interactive, Query-Guided Error Mitigation for Data Cleaning Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CleanEr, a generic framework that is used on top of existing data cleaning systems and that assists users in identifying the impact of potential cleaning errors on query results, and in deciding accordingly whether and how to proceed with the cleaning. |
Ran Schreiber; Yael Amsterdamer; |
| 411 | Wearables for Health (W4H) Toolkit for Acquisition, Storage, Analysis and Visualization of Data from Various Wearable Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Wearables for Health Toolkit (W4H Toolkit) is an open-source platform that provides a robust, end-to-end solution for the centralized management and analysis of wearable data. |
Arash Hajisafi; Maria Despoina Siampou; Jize Bi; Luciano Nocera; Cyrus Shahabi; |
| 412 | Chat2Query: A Zero-Shot Automatic Exploratory Data Analysis System with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Chat2Query, an LLM-empowered zero-shot automatic exploration data analysis system. |
Jun-Peng Zhu; Peng Cai; Boyan Niu; Zheming Ni; Kai Xu; Jiajun Huang; Jianwei Wan; Shengbo Ma; Bing Wang; Donghui Zhang; Liu Tang; Qi Liu; |
| 413 | EADS: An Early Anomaly Detection System for Sensor-Based Multivariate Time Series Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By meticulously analyzing these changes, CAD excels in ascertaining the precise time of anomalies and identifying the implicated sensors. In this demonstration, we introduce EADS, an Early Anomaly Detection System built upon CAD for sensor-based MTS. |
Yihao Ang; Qiang Huang; Anthony K. H. Tung; Zhiyong Huang; |
| 414 | Dsymb Playground: An Interactive Tool to Explore Large Multivariate Time Series Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We have also empirically shown that the computation time when using dsymb on a clustering time is significantly smaller than with DTW variants (typically 100 times faster). In this demonstration, we present the dsymb playground, an interactive web-based tool to interpret and compare a large multivariate time series dataset quickly. |
Sylvain W. Combettes; Paul Boniol; Charles Truong; Laurent Oudre; |
| 415 | ADecimo: Model Selection for Time Series Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite increasing academic interest and the large number of methods proposed in the literature, recent benchmark and evaluation studies demonstrated that there exists no single best anomaly detection method when applied to heterogeneous time series datasets. Therefore, the only scalable and viable solution to solve anomaly detection over very different time series collected from diverse domains is to propose a model selection method that will choose, based on time series characteristics, the best anomaly detection method to run. |
Paul Boniol; Emmanouil Sylligardos; John Paparrizos; Panos Trahanias; Themis Palpanas; |
| 416 | ChatGraph: Chat with Your Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the limitations, we propose a large language model (LLM)-based framework called ChatGraph. |
Yun Peng; Sen Lin; Qian Chen; Shaowei Wang; Lyu Xu; Xiaojun Ren; Yafei Li; Jianliang Xu; |
| 417 | A Fast Plan Enumerator for Recursive Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe a complete system of query optimization with parsers and compilers adapted for recursive queries over knowledge and property graphs. |
Amela Fejza; Pierre Genevès; Nabil Layaïda; |
| 418 | KGSEC: A Modular Framework for Knowledge Graph Schema Extraction and Comparison Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the lack of a common framework and evaluation metrics makes it difficult to combine them and compare the results. To fill this gap, we present a modular three-stage framework and we have developed a Python library and web application that performs schema extraction and allows users to visually assess and compare the results. |
Petros Skoufis; Dimitrios Skoutas; |
| 419 | QFusor: A UDF Optimizer Plugin for SQL Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this comes at a significant performance cost as UDFs routinely become the bottleneck in query execution. To deal with this problem, we present QFusor, an optimizer plugin for UDF queries in relational databases. |
Konstantinos Chasialis; Theoni Palaiologou; Yannis Foufoulas; Alkis Simitsis; Yannis Ioannidis; |
| 420 | ARTS: A System for Aggregate Related Table Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing table search techniques define table relatedness with unionability and/or joinability. |
Junjie Xing; H. V. Jagadish; |
| 421 | Explaining Expert Search Systems with ExES Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, state-of-the-art solutions to this problem lack transparency and interpretability. To address this issue, we demonstrate ExES, an interactive tool designed to explain expert search systems. |
Kiarash Golzadeh; Lukasz Golab; Jaroslaw Szlichta; |
| 422 | RAGE Against The Machine: Retrieval-Augmented LLM Explanations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper demonstrates RAGE, an interactive tool for explaining Large Language Models (LLMs) augmented with retrieval capabilities; i.e., able to query external sources and pull relevant information into their input context. |
Joel Rorseth; Parke Godfrey; Lukasz Golab; Divesh Srivastava; Jaroslaw Szlichta; |
| 423 | FairCR – An Evaluation and Recommendation System for Fair Classification Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Numerous algorithms have been proposed to tackle biased predictions leading to such discrimination, particularly for classification problems. These algorithms typically aim to reduce bias as defined by specific metrics. |
Nico Lässig; Melanie Herschel; Ole Nies; |
| 424 | GraphLingo: Domain Knowledge Exploration By Synchronizing Knowledge Graphs and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate GraphLingo, a natural language (NL)-based knowledge exploration system designed for exploring domain-specific knowledge graphs. |
Duy Le; Kris Zhao; Mengying Wang; Yinghui Wu; |
| 425 | MixedSearch: An Interactive System of Searching for The Best Tuple with Mixed Attributes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although there are some strategies to convert categorical attributes to numerical attributes, the conversion not only incurs poor efficiency, but also requires heavy interactive effort. In light of this, we developed an interactive system, called MixedSearch, and demonstrated that the system could find the best tuples for users in the database described by mixed attributes. |
Weicheng Wang; Min Xie; Raymond Chi-Wing Wong; |
| 426 | MorphStream: Scalable Processing of Transactions Over Streams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the realm of transactional stream processing (TSP), the challenge lies in providing a unified execution model that seamlessly integrates transactional and stream-oriented capabilities. |
Siqi Xiang; Zhonghao Yang; Jianjun Zhao; Yancan Mao; Shuhao Zhang; |
| 427 | FONT: A Flexible Polystore Evaluation Platform Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most current evaluation approaches for polystores only focus on cross-model analytical workloads and provide limited configurations. To address these problems, this paper presents a flexible polystore evaluation platform named FONT. |
Gengyuan Shi; Chaokun Wang; Minghao Zhang; Binbin Wang; |
| 428 | CAMO: Explaining Consensus Across MOdels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Explainable AI methods have been proposed to help interpret complex models, e.g., by assigning importance scores to model features or perturbing the features in a way that changes the prediction. |
Andy Yu; Parke Godfrey; Lukasz Golab; Divesh Srivastava; Jaroslaw Szlichta; |
| 429 | Pyneapple-R: Scalable and Expressive Spatial Regionalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through collaborations with social scientists and domain experts, we have identified emerging challenges in existing regionalization techniques, particularly regarding scalability and expressiveness. |
Yunfan Kang; Yongyi Liu; Hussah Alrashid; Akash Bilgi; Siddhant Purohit; Ahmed Mahmood; Sergio Rey; Amr Magdy; |
| 430 | SQL++: We Can Finally Relax! Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe SQL++, a SQL extension that relaxes SQL’s strictness in terms of both object structure (flat → nested) and schema (mandatory → optional), along with a multi-party effort to agree on a core definition and syntax supportable by multiple vendors. |
Michael Carey; Don Chamberlin; Almann Goo; Kian Win Ong; Yannis Papakonstantinou; Chris Suver; Sitaram Vemulapalli; Till Westmann; |
| 431 | Data Flow Architectures for Data Processing on Modern Hardware Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we argue that data management engines on modern hardware will necessarily be based on data flow designs where processing happens in a streaming and pipelined fashion across the entire architecture, a radical departure from existing engines. In the paper we argue why this will be the case, the advantages of such designs, and outline a research program to allow data processing engines take advantage of hardware developments. |
Alberto Lerner; Gustavo Alonso; |
| 432 | Personal Manifold: Management of Personal Data in The Age of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the visionary PERSONAL MANIFOLD system that supports a personal agent based on LLMs, tackles some of the associated data management challenges, and exposes others. |
Alon Halevy; Yuliang Li; Wang-Chiew Tan; |
| 433 | Applications and Challenges for Large Language Models: From Data Management Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Data management is indispensable for informed decision-making in the big data era. In the meantime, Large Language Models (LLMs), equipped with billions of model parameters and … |
Meihui Zhang; Zhaoxuan Ji; Zhaojing Luo; Yuncheng Wu; Chengliang Chai; |
| 434 | Routing with Massive Trajectory Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we cover aspects of this research. |
Christian S. Jensen; Bin Yang; Chenjuan Guo; Jilin Hu; Kristian Torp; |
| 435 | When Data Pricing Meets Non-Cooperative Game Theory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a blueprint for applying game theory to data pricing. |
Yuran Bi; Yihang Wu; Jinfei Liu; Kui Ren; Li Xiong; |
| 436 | Secure Normal Form: Mediation Among Cross Cryptographic Leakages in Encrypted Databases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a new architecture to support secure normal form. |
Shufan Zhang; Xi He; Ashish Kundu; Sharad Mehrotra; Shantanu Sharma; |
| 437 | Reactive Knowledge Management Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we propose the design and prototyping of the next generation of knowledge management concepts and systems, which will support domain diversity and scientific evolution as foundational ingredients. |
Stefano Ceri; Anna Bernasconi; Alessia Gagliardi; |
| 438 | LakeHarbor: Making Structures First-Class Citizens in Data Lakes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces LakeHarbor, a new data management paradigm that makes structures (e.g., indexes) first-class citizens in data lakes. |
Hiroyuki Yamada; Masaru Kitsuregawa; Kazuo Goda; |
| 439 | A CXL-Powered Database System: Opportunities and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through a thorough analysis of CXL’s key characteristics, this paper identifies emerging opportunities, particularly in buffer pool expansion, memory elasticity, swift data recovery, and index optimization. More importantly, this paper outlines a series of new challenges accompanying these opportunities, with the objective of inspiring cutting-edge approaches in future DBMS design that emphasize efficiency, reliability, and reduced total cost of ownership. |
Yunyan Guo; Guoliang Li; |
| 440 | BIFROST: A Future Graph Database Runtime Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The approach provides high fidelity for even highly irregular labeled property graphs and gives good performance when compared to other systems that depend on fixed schemas for query planning and optimization. |
James Clarkson; Georgios Theodorakis; Jim Webber; |
| 441 | V2V: Efficiently Synthesizing Video Results for Video Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe V2V, a system to efficiently synthesize video results for video queries. |
Dominik Winecki; Arnab Nandi; |
| 442 | Higher-Order SQL Lambda Functions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead of extracting runnable code and data out of a database system, we propose higher-order SQL lambda functions for in-database execution. |
Maximilian E. Schüle; Jakob Hornung; |
| 443 | PR-GNN: Enhancing PoC Report Recommendation with Graph Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There is a limitation in the current modeling of different types of trigger methods in PoC reports, as it mainly focuses only on code-based trigger methods while disregarding other types of trigger methods. To tackle the issues, this Ph.D. research uses graph-based method to model all types of PoC and proposes a PoC report recommendation model utilizing graph neural network (PR-GNN) to provide related PoC reports when facing a new vulnerability. |
Jiangtao Lu; Song Huang; |
| 444 | Cascade: Optimal Transaction Scheduling for High-Contention Workloads Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate what it means for a batch to be executed optimally in the face of high contention. |
Tim Baccaert; Bas Ketsman; |
| 445 | Construction and Enhancement of An RNA-Based Knowledge Graph for Discovering New RNA Drugs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Even if many structured and unstructured data sources report the interaction among different RNA molecules and some other biomedical entities (e.g., drugs, diseases, genes), we still lack a comprehensive and well-described RNA-centered Knowledge Graph (KG) that contains such information and sophisticated services that support the user in its creation, maintenance, and enhancement. This PhD project aims to create a biomedical KG (named RNA-KG) to represent, and eventually infer, biological, experimentally validated interactions between different RNA molecules. |
Emanuele Cavalleri; Marco Mesiti; |
| 446 | Enhancing Data Systems Performance By Exploiting SSD Concurrency & Asymmetry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite these, most storage-intensive applications are not optimized for SSD asymmetry and concurrency, often leading to device underutilization. In this thesis, we uncover these crucial SSD properties and outline how we can better exploit these properties from the application perspective. |
Tarikul Islam Papon; |
| 447 | Differential Analysis for System Provenance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop algorithms that report all the differences precisely across two execution traces generated from the same application’s provenance graph structure. |
Yuta Nakamura; Tanu Malik; |
| 448 | Evaluating Text-to-SQL Model Failures on Real-World Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify three main challenges in real-world Text-to-SQL applications: long context length, unclear question formulation, and greater query complexity. |
Manasi Ganti; Laurel Orr; Sen Wu; |
| 449 | Synergies Between Graph Data Management and Machine Learning in Graph Data Pipeline Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate how graph data management, which deals with effective, efficient, scalable, and user-friendly systems and algorithms for storing, processing, and analyzing large volumes of heterogeneous and complex graphs, could benefit from graph machine learning and vice versa, over the end-to-end graph data pipeline. |
Arijit Khan; |
| 450 | Large Language Models As Storage for SQL Querying Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Declarative querying is one of the main features behind the popularity of databases. However, SQL can be executed only on structured datasets, leaving out of immediate reach … |
Paolo Papotti; |
| 451 | Accelerating Deletion Interventions on OLAP Workload Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use provenance to propagate the deletion status of tuples per operator, in a tight loop that leads to improvement in instruction and data locality. |
Haneen Mohammed; Alexander Yao; Lampros Flokas; Zhong Hongbin; Charlie Summers; Eugene Wu; |
| 452 | User Learning In Interactive Data Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an analysis of existing data exploration logs to quantify shifts in users’ data exploration strategies over time. |
Sanad Saha; Nischal Aryal; Leilani Battle; Arash Termehchy; |
| 453 | Multivariate Similarity Search – A Call for A New Breed of Similarity Search Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this talk we will revisit similarity search under the lens of multivariate similarity measures. |
Odysseas Papapetrou; Jens E. d’Hondt; |
| 454 | Towards Streaming Consistency Management Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Stream processing is designed to query unbounded and timely-ordered data flows in real-time while guaranteeing low latency and high throughput. … |
Samuele Langhi; Angela Bonifati; Riccardo Tommasini; |
| 455 | Unveiling Dis-Integration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The individual state-of-the-art ER algorithms are offered through open-source systems, such as Magellan [2] and JedAI [3], which typically implement end-to-end solutions through a sequence of workflow steps. |
George Papadakis; Ekaterini Ioannou; Yannis Velegrakis; |
| 456 | Cross-Source ML Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Machine learning (ML) often operates on data fragmented across silos through two paradigms: distributed or centralized. This study illuminates the underexplored significance of … |
Wenbo Sun; Rihan Hai; |
| 457 | Why Model-Based Lossy Compression Is Great for Wind Turbine Analytics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Modern wind turbines are equipped with wired high-quality sensors that produce high-frequency sensor data in the form of time series as shown in Figure 1 a. From working with multiple different practitioners, we have learned that relatively few but very long high-quality time series are produced. |
Søren Kejser Jensen; Christian Thomsen; Torben Bach Pedersen; Carlos Enrique Muñiz-Cuza; Abduvoris Abduvakhobov; |
| 458 | Towards Explainability in Retrieval-Augmented LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In an era where artificial intelligence (AI) is re-shaping countless aspects of society, we present a forward-looking perspective for enhancing the explainability of large language models (LLMs), with a particular focus on the retrieval-augmented generation (RAG) prompting technique. |
Joel Rorseth; Parke Godfrey; Lukasz Golab; Divesh Srivastava; Jaroslaw Szlichta; |
| 459 | Benchmarking Data Management Systems for Microservices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Data exchanges and communication among microservices are often achieved via asynchronous events. |
Rodrigo Laigner; Yongluan Zhou; |
| 460 | Exploring The Space of Model Comparisons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The ML deployment pipeline suffers further from a fracture: on the one side, one has the “data-science” (DS) pipeline, in which one extracts, loads, transforms, and maintains the vast lakes of data the models need to be trained on; on the other side, one has the “ML” pipeline in which experts test and evaluate models, often comparing many, for fitness for the task. To progress ultimately, these two pipelines must be integrated into a single DS/ML pipeline. |
Andy Yu; Parke Godfrey; Lukasz Golab; Divesh Srivastava; Jaroslaw Szlichta; |
| 461 | On Native Location-Optimized Data Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With the ubiquity of location detection devices and location services and the massive amounts of the location data being produced, there is dire need for developing highly … |
Walid G. Aref; |
| 462 | Observations and Opportunities in Solving Large-Scale Graph Data Processing Challenges at ByteDance By Using Heterogeneous Hardware Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This talk will outline the challenges ByteDance encounters in scaling and processing graph data and will highlight two scenarios: real-time incremental graph processing using CPU-GPU combinations (speed up 13.1x), and dynamic graph random walks on FPGA clusters (speed up 6x). |
Cheng Chen; Shuai Zhang; |
| 463 | Data Lakes: A Survey of Functions and Systems (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We hope that the thorough comparison of existing solutions and the discussion of open research challenges in this survey will motivate the future development of data lake research and practice. |
Rihan Hai; Christos Koutras; Christoph Quix; Matthias Jarke; |
| 464 | OOD-GNN: Out-of-Distribution Generalized Graph Neural Network: (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing GNNs lack out-of-distribution generalization abilities so that their performance substantially degrades when there exist distribution shifts between testing and training graph data. To solve this problem, we propose an out-of-distribution generalized graph neural network (OOD-GNN) for achieving satisfactory performance on unseen testing graphs that have different distributions with training graphs. |
Haoyang Li; Xin Wang; Ziwei Zhang; Wenwu Zhu; |
| 465 | Hierarchical Adaptive Pooling By Capturing High-order Dependency for Graph Representation Learning (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing pooling methods either struggle to capture local substructures or fail to utilize high-order dependency, thus diminishing the expression capability. To solve this problem, we propose HAP, a hierarchical graph-level representation learning framework adaptively sensitive to graph structures. |
Ning Liu; Songlei Jian; Dongsheng Li; Yiming Zhang; Zhiquan Lai; Hongzuo Xu; |
| 466 | PLAME: Piecewise-Linear Approximate Measure for Additive Kernel SVM (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most of the existing methods normally fail to achieve one of these three important conditions which are (1) low classification error, (2) low memory space, and (3) low training time. In order to simultaneously fulfill these three conditions, we develop the new piecewise-linear approximate measure (PLAME) for training additive kernel SVM models. |
Tsz Nam Chan; Zhe Li; Leong Hou U; Reynold Cheng; |
| 467 | Short-Text Author Linking Through Multi-Facet Temporal-Textual Embedding (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We devise a neural network-based temporal-textual framework that generates subgraphs with highly correlated authors from short-text contents. |
Saeed Najafipour; Saeid Hosseini; Wen Hua; Mohammad Reza Kangavari; Xiaofang Zhou; |
| 468 | DKWS: A Distributed System for Keyword Search on Massive Graphs (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Addressing the complexities of querying unstructured graphs such as knowledge graphs and social networks, this paper introduces DKWS, a novel distributed keyword search system. |
Jiaxin Jiang; Byron Choi; Xin Huang; Jianliang Xu; Sourav S Bhowmick; |
| 469 | Multi-Grained Semantics-Aware Graph Neural Networks (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a unified model, AdamGNN, to interactively learn node and graph representations in a mutual-optimisation manner. |
Zhiqiang Zhong; Cheng-Te Li; Jun Pang; |
| 470 | Distilled Neural Networks for Efficient Learning to Rank: (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a framework to design and train neural networks outperforming ensembles of regression trees. |
Franco Maria Nardini; Cosimo Rulli; Salvatore Trani; Rossano Venturini; |
| 471 | Higher-Order Truss Decomposition in Graphs (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Graphs have been widely used to represent the relationships of entities in real-world applications [1], [2]. k-truss model is a typical cohesive subgraph model and has received considerable attention due to its unique cohesive properties on degree and bounded diameter [3], [4]. |
Zi Chen; Long Yuan; Li Han; Zhengping Qian; |
| 472 | Finding The Maximum K-Balanced Biclique on Weighted Bipartite Graphs (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As a popular data structure, the bipartite graph is widely used to model the complex relationships between two types of entities in many real-world application domains [1]. … |
Yiwei Zhao; Zi Chen; Chen Chen; Xiaoyang Wang; Xuemin Lin; Wenjie Zhang; |
| 473 | Enabling Efficient, Verifiable, and Secure Conjunctive Keyword Search in Hybrid-Storage Blockchains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we formally define the problem of efficient, verifiable, and secure conjunctive keyword search in hybrid-storage blockchains (vsChain) and propose a novel hybrid index that achieves efficient query and verification while supporting dynamic updates with forward privacy guarantee. |
Ningning Cui; Dong Wang; Jianxin Li; Huaijie Zhu; Xiaochun Yang; Jianliang Xu; Jie Cui; Hong Zhong; |
| 474 | Hybrid Regret Minimization: A Submodular Approach (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the hybrid regret minimization (HRM) query, a new method to extract representative tuples from databases. |
Jiping Zheng; Fanxu Meng; Yanhao Wang; Xiaoyang Wang; Sheng Wang; Yuan Ma; Zhiyang Hao; |
| 475 | A Neural Database for Answering Aggregate Queries on Incomplete Relational Data (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Real-world datasets are often incomplete due to data collection cost, privacy considerations or as a side effect of data integration/preparation. We focus on answering aggregate queries on such datasets, where data incompleteness causes the answers to be inaccurate. |
Sepanta Zeighami; Raghav Seshadri; Cyrus Shahabi; |
| 476 | Mutual Information-Guided GA for Bayesian Network Structure Learning (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Genetic algorithms are powerful for solving combinatorial optimization problems, but the lack of effective guidance results in slow convergence and low accuracy regarding BNSL. To address this problem, we propose a mutual information (MI) guided genetic algorithm (MIGA) for BNSL in this paper, which uses MI to effectively search BN structures. |
Kefei Yan; Wei Fang; Hengyang Lu; Xin Zhang; Jun Sun; Xiaojun Wu; |
| 477 | Efficient Discovery of Functional Dependencies on Massive Data (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Functional dependencies (FDs) are the most common constraints in the design theory for relational databases, generalizing the concept of a key for a relation. Given an attribute … |
Xiaolong Wan; Xixian Han; Jinbao Wang; Jianzhong Li; |
| 478 | Neural Similarity Search on Supergraph Containment (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the first learning-based method for similarity search on supergraph containment, named Neural Supergraph similarity Search (NSS). |
Hanchen Wang; Jianke Yu; Xiaoyang Wang; Chen Chen; Wenjie Zhang; Xuemin Lin; |
| 479 | Contact Tracing Over Uncertain Indoor Positioning Data (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we formulate a novel query called Indoor Contact Query (ICQ) over raw, uncertain indoor positioning data that digitalizes people’s indoor mobility. |
Tiantian Liu; Huan Li; Hua Lu; Muhammad Aamir Cheema; Harry Kai-Ho Chan; |
| 480 | Efficient Semi-External SCC Computation (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Computing strongly connected components (SCC) is a key operation for many applications on directed graphs. Specifically, an SCC of a directed graph $G$ is one of its maximal … |
Xiaolong Wan; Hongzhi Wang; |
| 481 | Value-Wise ConvNet for Transformer Models: An Infinite Time-Aware Recommender System (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Addressing the challenge of matching queries with the right experts amid temporal-textual inconsistencies, we present a novel approach that combines an attention-based text embedding model with a continuous-time module. |
Mohsen Saaki; Saeid Hosseini; Sana Rahmani; Mohammad Reza Kangavari; Wen Hua; X. Zhou; |
| 482 | Contrastive Graph Representations for Logical Formulas Embedding (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a novel model of Contrastive Graph Representations (ConGR) for logical formulas embedding. |
Qika Lin; Jun Liu; Lingling Zhang; Yudai Pan; Xin Hu; Fangzhi Xu; Hongwei Zeng; |
| 483 | CUBE: Causal Intervention-based Counterfactual Explanation for Prediction Models (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate causal counterfactual explanation generation and propose CUBE, a causal intervention-based counterfactual explanation method. |
Xinyue Shao; Hongzhi Wang; Xiang Chen; Xiao Zhu; Yan Zhang; |
| 484 | Data Level Privacy Preserving: A Stochastic Perturbation Approach Based on Differential Privacy (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With the great amount of available data, especially collected from the ubiquitous Internet of Things (IoT), the issue of privacy leakage has been an increasing concern recently. … |
Chuan Ma; Long Yuan; Li Han; Ming Ding; Raghav Bhaskar; Jun Li; |
| 485 | Incremental Graph Computation: Anchored Vertex Tracking in Dynamic Social Networks (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we target a new research problem called Anchored Vertex Tracking (AVT), aiming to track the anchored users at each timestamp of evolving networks. |
Taotao Cai; Shuiqiao Yang; Jianxin Li; Quan Z. Sheng; Jian Yang; Xin Wang; Wei Emma Zhang; Longxiang Gao; |
| 486 | Pushing ML Predictions Into DBMSs (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the use of Relational Database Management Systems to reduce technical debt in Machine Learning deployments, specifically focusing on in-DBMS prediction serving. |
Matteo Paganelli; Paolo Sottovia; Kwanghyun Park; Matteo Interlandi; Francesco Guerra; |
| 487 | Searching Personalized K-wing in Bipartite Graphs (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study a new query-dependent bipartite cohesive subgraph search problem based on k-wing model. |
Aman Abidi; Lu Chen; Rui Zhou; Chengfei Liu; |
| 488 | Complex Event Summarization Using Multi-Social Attribute Correlation (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in many critical situations, social events are complex and context-sensitive, which demands the online summarization of social events in an integrated manner. Motivated by this, we propose an online complex social event summarization approach, namely SOMA, which summarizes the complex social events over multiple attributes including media content and contexts simultaneously. |
Xi Chen; Xiangmin Zhou; Jeffrey Chan; Lei Chen; Timos Sellis; Yanchun Zhang; |
| 489 | Efficient Community Search in Edge-Attributed Graphs (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we proposed the Edge-Attributed Community Search (EACS) problem and proved that the EACS problem is NP-hard. |
Ling Li; Yuhai Zhao; Siqiang Luo; Guoren Wang; Zhengkui Wang; |
| 490 | GPU-Based Efficient Parallel Heuristic Algorithm for High-Utility Itemset Mining in Large Transaction Datasets (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, heuristic algorithms still face the problem of long runtime and insufficient mining quality, especially for large transaction datasets with thousands to tens of thousands of items and up to millions of transactions. To solve these problems, a novel GPU-based efficient parallel heuristic algorithm for HUIM (PHA-HUIM) is proposed in this paper. |
Wei Fang; Haipeng Jiang; Hengyang Lu; Jun Sun; Xiaojun Wu; Jerry Chun-Wei Lin; |
| 491 | An Investigation of SMOTE Based Methods for Imbalanced Datasets with Data Complexity Analysis (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This extended abstract highlights challenges with imbalanced datasets in real-world applications, where issues like noise, class overlap, and small subsets of data impact classification accuracy. |
Nur Athirah Azhar; Muhammad Syafiq Mohd Pozi; Aniza Mohamed Din; Adam Jatowt; |
| 492 | An Experimental Survey of Missing Data Imputation Algorithms (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We shed light on a series of constructive insights on imputation algorithms to tackle missing data problem in real-life scenarios. |
Xiaoye Miao; Yangyang Wu; Lu Chen; Yunjun Gao; Jianwei Yin; |
| 493 | Differentiable and Scalable Generative Adversarial Models for Data Imputation (Extended Abstract) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an effective scalable imputation system named SCIS to significantly speed up the training of the differentiable generative adversarial imputation models under accuracy-guarantees for large-scale incomplete data. |
Yangyang Wu; Jun Wang; Xiaoye Miao; Wenjia Wang; Jianwei Yin; |