Publications
Note: Authors marked with * are the corresponding authors. Papers marked with ** use alphabetic ordering of authors, following the convention of theoretical computer science.
Journal Articles
- When Transformer Meets Large Graphs: An Expressive and Efficient Two-View Architecture. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
- A Survey on Large Language Model Based Autonomous Agents. Frontiers of Computer Science (FCS), 2024 [arXiv]
- Efficient Algorithms for Personalized PageRank Computation: A Survey. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024 [arXiv]
- Enabling Efficient Random Access to Hierarchically Compressed Text Data on Diverse GPU Platforms. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2023
- Influence Maximization Revisited: Efficient Sampling with Bound Tightened. ACM Transactions on Database Systems (TODS), 2022
- Building Graphs at Scale via Sequence of Edges: Model and Generation Algorithms. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022
- ExactSim: Benchmarking Single-Source SimRank Algorithms with High-Precision Ground Truths. The VLDB Journal, 2021
- Efficient Algorithms for Approximate Single-Source Personalized PageRank Queries. ACM Transactions on Database Systems (TODS), 2019 [arXiv]
- Parallel Trajectory-to-Location Join. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2019
- Distribution-Aware Crowdsourced Entity Collection. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2019
- Tight Space Bounds for Two-Dimensional Approximate Range Counting. ACM Transactions on Algorithms (TALG), 2018
- Optimal Algorithms for Selecting Top-k Combinations of Attributes: Theory and Applications. The VLDB Journal, 2017
- Dynamic Shortest Path Monitoring in Spatial Networks. Journal of Computer Science and Technology (JCST), 2016
- Collective Travel Planning in Spatial Networks. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2015
Conference Articles
- Intruding with Words: Towards Understanding Graph Injection Attacks at the Text Level. To appear in Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
- S-MolSearch: 3D Semi-Supervised Contrastive Learning for Bioactive Molecule Search. To appear in Annual Conference on Neural Information Processing Systems (NeurIPS), 2024 [arXiv]
- SRAP-Agent: Simulating and Optimizing Scarce Resource Allocation Policy with LLM-Based Agent. To appear in Conference on Empirical Methods in Natural Language Processing (EMNLP Findings), 2024 [arXiv]
- Beyond Over-smoothing: Uncovering the Trainability Challenges in Deep Graph Neural Networks. To appear in The Conference on Information and Knowledge Management (CIKM), 2024 [arXiv]
- Federated Heterogeneous Contrastive Distillation for Molecular Representation Learning. To appear in The Conference on Information and Knowledge Management (CIKM), 2024
- PolyFormer: Scalable Node-wise Filters via Polynomial Graph Transformer. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2024 [arXiv]
- EquiPocket: an E(3)-Equivariant Geometric Graph Neural Network for Ligand Binding Site Prediction. International Conference on Machine Learning (ICML), 2024. (Oral) [arXiv]
- Optimal Matrix Sketching over Sliding Windows. International Conference on Very Large Data Bases (VLDB), 2024. (Best Paper Nomination)
- HierAffinity: Predicting Protein-Ligand Binding Affinity With Hierarchical Modeling. International Conference on Database Systems for Advanced Applications (DASFAA), 2024
- Learning-based Property Estimation with Polynomials. ACM Conference on Management of Data (SIGMOD), 2024
- **Revisiting Local Computation of PageRank: Simple and Optimal. Annual ACM Symposium on Theory of Computing (STOC), 2024
- Exploring Neural Scaling Law and Data Pruning Methods For Node Classification on Large-scale Graphs. The Web Conference (TheWebConf), 2024. (Oral)
- Spectral Heterogeneous Graph Convolutions via Positive Noncommutative Polynomials. The Web Conference (TheWebConf), 2024. (Oral) [arXiv]
- PolyGCL: Graph Contrastive Learning via Learnable Spectral Polynomial Filters. International Conference on Learning Representations (ICLR), 2024. (Spotlight) [Code]
- **Approximating Single-Source Personalized PageRank with Absolute Error Guarantees. International Conference on Database Theory (ICDT), 2024 [arXiv]
- Do Deep Learning Methods Really Perform Better in Molecular Conformation Generation? International Conference on Learning Representations (ICLR), 2023. (MLDD Oral) [arXiv]
- Estimating Single-Node PageRank in \(\tilde{O}\left(\min\{d_t, \sqrt{m}\}\right)\) Time. International Conference on Very Large Data Bases (VLDB), 2023 [arXiv]
- MGNN: Graph Neural Networks Inspired by Distance Geometry Problem. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023
- Optimal Dynamic Subset Sampling: Theory and Applications. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023 [arXiv]
- Clenshaw Graph Neural Networks. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023 [arXiv]
- Graph Neural Networks with Learnable and Optimal Polynomial Bases. International Conference on Machine Learning (ICML), 2023 [arXiv]
- **On Range Summary Queries. International Colloquium on Automata, Languages and Programming (ICALP), 2023 [arXiv]
- Decoupled Graph Neural Networks for Large Dynamic Graphs. International Conference on Very Large Data Bases (VLDB), 2023 [arXiv]
- Uni-Mol: A Universal 3D Molecular Representation Learning Framework. International Conference on Learning Representations (ICLR), 2023
- Personalized PageRank on Evolving Graphs with an Incremental Index-Update Scheme. ACM Conference on Management of Data (SIGMOD), 2023 [arXiv]
- EvenNet: Ignoring Odd-Hop Neighbors Improves Robustness of Graph Neural Networks. Annual Conference on Neural Information Processing Systems (NeurIPS), 2022
- Convolutional Neural Networks On Graphs With Chebyshev Approximation, Revisited. Annual Conference on Neural Information Processing Systems (NeurIPS), 2022. (Oral)
- Approximating Probabilistic Group Steiner Trees in Graphs. International Conference on Very Large Data Bases (VLDB), 2022
- MGMAE: Molecular Representation Learning by Reconstructing Heterogeneous Graphs with A High Mask Ratio. The Conference on Information and Knowledge Management (CIKM), 2022
- Optimizing Random Access to Hierarchically-Compressed Data on GPU. International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2022
- Sampling-based Estimation of the Number of Distinct Values in Distributed Environment. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022 [arXiv]
- Graph Neural Networks with Node-wise Architecture. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022
- Instant Graph Neural Networks for Dynamic Graphs. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2022 [arXiv]
- Edge-based Local Push for Personalized PageRank. International Conference on Very Large Data Bases (VLDB), 2022 [arXiv]
- Learning to be a Statistician: Learned Estimator for Number of Distinct Values. International Conference on Very Large Data Bases (VLDB), 2021 [arXiv]
- BernNet: Learning Arbitrary Graph Spectral Filters via Bernstein Approximation. Annual Conference on Neural Information Processing Systems (NeurIPS), 2021
- Approximate Graph Propagation. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021
- Graph Neural Networks Inspired by Classical Iterative Algorithms. International Conference on Machine Learning (ICML), 2021. (Long Talk) [arXiv]
- Massively Parallel Algorithms for Personalized PageRank. International Conference on Very Large Data Bases (VLDB), 2021
- Unifying the Global and Local Approaches: An Efficient Power Iteration with Forward Push. ACM Conference on Management of Data (SIGMOD), 2021 [arXiv]
- FlashP: an Analytical Pipeline for Real-time Forecasting of Time-Series Relational Data. International Conference on Very Large Data Bases (VLDB), 2021 [arXiv]
- Scalable Graph Neural Networks via Bidirectional Propagation. Annual Conference on Neural Information Processing Systems (NeurIPS), 2020
- SimTab: Accuracy-Guaranteed SimRank Queries Through Tighter Confidence Bounds and Multi-Armed Bandits. International Conference on Very Large Data Bases (VLDB), 2020
- Simple and Deep Graph Convolutional Networks. International Conference on Machine Learning (ICML), 2020. (World Artificial Intelligence Conference Youth Outstanding Paper Nomination Award)
- Personalized PageRank to a Target Node, Revisited. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020
- Influence Maximization Revisited: Efficient Reverse Reachable Set Generation with Bound Tightened. ACM Conference on Management of Data (SIGMOD), 2020
- Exact Single-Source SimRank Computation on Large Graphs. ACM Conference on Management of Data (SIGMOD), 2020
- CrowdGame: A Game-Based Crowdsourcing System for Cost-Effective Data Labeling. ACM Conference on Management of Data (SIGMOD), 2019
- Scalable Graph Embeddings via Sparse Transpose Proximities. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019. (Oral)
- Efficient Estimation of Heat Kernel PageRank for Local Clustering. ACM Conference on Management of Data (SIGMOD), 2019 [arXiv]
- PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs. ACM Conference on Management of Data (SIGMOD), 2019
- Cost-Effective Data Annotation Using Game-Based Crowdsourcing. International Conference on Very Large Data Bases (VLDB), 2018
- TopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large Graphs. ACM Conference on Management of Data (SIGMOD), 2018
- Trajectory Similarity Join in Spatial Networks. International Conference on Very Large Data Bases (VLDB), 2017 [Poster]
- ProbeSim: Scalable Single-Source and Top-k SimRank Computations on Dynamic Graphs. International Conference on Very Large Data Bases (VLDB), 2017
- Collective Travel Planning in Spatial Networks. IEEE International Conference on Data Engineering (ICDE), 2017
- Tracking Matrix Approximation over Distributed Sliding Windows. IEEE International Conference on Data Engineering (ICDE), 2017
- FORA: Simple and Effective Approximate Single-Source Personalized PageRank. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2017
- Towards Maximum Independent Sets on Massive Graphs. International Conference on Very Large Data Bases (VLDB), 2015
- **Equivalence between Priority Queues and Sorting in External Memory. European Symposium on Algorithms (ESA), 2014
- **The Space Complexity of 2-Dimensional Approximate Range Counting. ACM-SIAM Symposium on Discrete Algorithms (SODA), 2013
- **Mergeable Summaries. ACM Symposium on Principles of Database Systems (PODS), 2012. (Test of Time Award)
- **Beyond Simple Aggregates: Indexing for Summary Queries. ACM Symposium on Principles of Database Systems (PODS), 2011 [Slides]
- **Dynamic External Hashing: The Limit of Buffering. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 2009
Theses
- Classic and New Data Structure Problems in External Memory. Hong Kong University of Science and Technology, 2012