Dongkuan (DK) Xu / 胥栋宽

Hello! I am a Ph.D. student at Penn State, where I work on machine learning, natural language processing, and data mining, advised by Xiang Zhang. I received my M.S. in Optimization from the University of Chinese Academy of Sciences, where I was advised by Yingjie Tian, and my B.E. from Renmin University of China, advised by Wei Xu.

In summer 2021, I was a research intern at Microsoft Research, working with Subho, Xiaodong, Dey, Ahmed, and Jianfeng on neural architecture search for efficient Transformer models. In 2020, I was an intern research scientist at Moffett AI, working with Ian En-Hsu Yen on model pruning and few-shot knowledge distillation. I also spent two wonderful summers (2018 and 2019) as a research intern with Wei Cheng at NEC Labs America, working on contrastive learning and multi-task learning.

Outside of work, I am a big fan of American football. I love the Nittany Lions and the New York Giants. I also enjoy working out, soccer, and hotpot.

Email  /  CV (Nov. 2021)  /  Twitter  /  Google Scholar  /  LinkedIn   

(I'm on the job market this year. Please feel free to contact me.)   

Research

I am interested in efficient AI, including parameter efficiency, data efficiency, and computation efficiency. My current research investigates how to improve the efficiency of deep learning systems so as to achieve Pareto optimality between resources (e.g., parameters, data, computation) and performance (e.g., inference, training). My long-term goal is to free AI from its hunger for parameters, data, and computation, and to democratize AI so that it serves a broader range of domains and populations.

  • Parameter Efficiency: Neural Architecture Search, Pruning, Network Modularization

  • Data Efficiency: Few-shot Learning, Contrastive Learning, Generator Learning

  • Computation Efficiency: Weight-sharing Learning, Reduced-cost Training, Trainingless Proxies

  • Model Architectures: Transformers, Temporal Networks, Graph Neural Networks

News
  • 11/2021: Invited to serve as a PC member for KDD'22.
  • 11/2021: Invited to give a talk "Be Careful with Pruning: Underfitting vs. Overfitting" at Brandeis University.
  • 10/2021: Our ML&NLP academic community has officially launched (>500k followers).
  • 10/2021: Received IST Fall 2021 Travel Award.
  • 09/2021: Our work, InfoGCL, was accepted to NeurIPS'21!
  • 08/2021: Invited to serve as a PC member for AAAI'22, ACL Rolling Review'22, SDM'22.
  • 07/2021: Received a complimentary ACM student membership. Thank you, ACM!
  • 06/2021: Invited to serve as a PC member for ICLR'22, WSDM'22, IJCAI-ECAI'22.
  • 05/2021: Received NAACL 2021 Scholarship.
  • 05/2021: One paper was accepted to ACL'21!
  • 05/2021: Excited to join Microsoft Research as a research intern working on neural architecture search!
  • 04/2021: Gave a talk titled "BERT Pruning: Structural vs. Sparse" at Brandeis University (slides).
  • 04/2021: Gave a talk titled "BERT, Compression and Applications" at Xpeng Motors (小鹏汽车) (slides).
  • 04/2021: Invited to serve as a PC member for NeurIPS'21.
  • 03/2021: My application to SDM'21 Doctoral Forum has been accepted. See you in May!
  • 03/2021: Received a SIAM Student Travel Award to attend SDM'21.
  • 03/2021: Our work, SparseBERT, was accepted to NAACL'21! Along with three U.S. patent applications!
  • 03/2021: Invited to serve as a PC member for EMNLP'21, CIKM'21.
  • 03/2021: Received IST Spring 2021 Travel Award.
  • 12/2020: One paper was accepted to SDM'21. See you virtually in April!
  • 12/2020: Invited to serve as a Senior PC member for IJCAI'21.
  • 12/2020: Four papers were accepted to AAAI'21. See you virtually in February!
  • 12/2020: Invited to serve as a PC member for ICML'21, KDD'21, NAACL'21, IJCNN'21.
  • 09/2020: Our work, PGExplainer, was accepted to NeurIPS'20.
  • 09/2020: Invited to serve as a journal reviewer for Information Fusion.
  • 08/2020: Invited to serve as a PC member for AAAI'21, EACL'21.
  • 08/2020: Received KDD 2020 Student Registration Award.
  • 06/2020: Invited to serve as a reviewer for NeurIPS'20.
  • 05/2020: Happy to join Moffett AI as an intern research scientist.
  • 04/2020: One paper was accepted to SIGIR'20.
  • 04/2020: Invited to serve as a PC member for KDD'20 (Research Track & Applied Science Track).
  • 03/2020: Invited to serve as a PC member for EMNLP'20, CIKM'20, AACL-IJCNLP'20.
  • 02/2020: Received IST Spring 2020 Travel Award.
  • 12/2019: Invited to serve as a PC member for IJCAI'20, IJCNN'20.
  • 12/2019: Received AAAI 2020 Student Scholarship.
  • 11/2019: Two papers were accepted to AAAI'20. See you in the Big Apple!
  • 08/2019: Invited to serve as a PC member for AAAI'20.
  • 08/2019: One paper was accepted to ICDM'19.
  • 05/2019: One paper was accepted to IJCAI'19.
  • 05/2019: Happy to join NEC Labs America as a research intern.
  • 03/2019: Received IST Spring 2019 Travel Award.
  • 01/2019: Grateful to receive the IST Award for Excellence in Teaching Support (News).
  • 01/2019: Invited to serve as a PC member for IJCNN'19.
  • 12/2018: One paper was accepted to SDM'19. See you in Calgary!
  • 05/2018: Started working at NEC Labs America as a research intern.
  • 11/2017: Invited to serve as a PC member for IJCNN'18.
Teaching Experience
  • Guest Lecturer
    • COSI 133A: Graph Mining
      Brandeis University, 2021 Fall

    • COSI 165B: Deep Learning
      Brandeis University, 2021 Spring

Supervised and Collaborating Students
  • Shaoyi Huang, Ph.D. at University of Connecticut
    Topic I: Sparse Neural Architecture Search
    Topic II: Few-shot BERT Distillation

  • Tianxiang Zhao, Ph.D. at Penn State University
    Topic: Graph Transfer Learning

  • Zhenglun Kong, Ph.D. at Northeastern University
    Topic: Efficient Auto Vision Transformer Search

  • Shanglin Zhou, Ph.D. at University of Connecticut
    Topic: Data-free Model Compression

  • Bowen Lei, Ph.D. at Texas A&M University
    Topic: Robust Sparse Neural Network Training

  • Wei Zhang, Undergraduate at Renmin University of China
    (now a Ph.D. student at City University of Hong Kong)
    Topic: Cost-Sensitive Multi-Instance Learning

Publications

2021

InfoGCL: Information-Aware Graph Contrastive Learning (to appear)
Dongkuan Xu, Wei Cheng, Dongsheng Luo, Haifeng Chen, Xiang Zhang
[NeurIPS 2021] The 35th Conference on Neural Information Processing Systems
Code / Supp / Slides

We propose an information-aware contrastive learning framework for graph-structured data, and show for the first time that all recent graph contrastive learning methods can be unified by our framework.

(SparseBERT) Rethinking Network Pruning - under the Pre-train and Fine-tune Paradigm
Dongkuan Xu, Ian En-Hsu Yen, Jinxi Zhao, Zhibin Xiao
[NAACL-HLT 2021] The 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Code / Supp / Slides

We study how knowledge is transferred and lost during the pre-training, fine-tuning, and pruning process, and propose a knowledge-aware sparse pruning method that significantly outperforms existing approaches.

Data Augmentation with Adversarial Training for Cross-Lingual NLI
Xin Dong, Yaxin Zhu, Zuohui Fu, Dongkuan Xu, Gerard de Melo
[ACL-IJCNLP 2021] The 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing
Code / Supp / Slides

We study data augmentation for cross-lingual natural language inference and propose two methods of training a generative model to induce synthesized examples to reflect more diversity in a semantically faithful way.

Deep Multi-Instance Contrastive Learning with Dual Attention for Anomaly Precursor Detection
Dongkuan Xu, Wei Cheng, Jingchao Ni, Dongsheng Luo, Masanao Natsumeda, Dongjin Song, Bo Zong, Haifeng Chen, Xiang Zhang
[SDM 2021] The 21st SIAM International Conference on Data Mining
Code / Supp / Slides

We utilize multi-instance learning to model the uncertainty of the precursor period, and design a contrastive loss to address the scarcity of annotated anomalies.

Multi-Task Recurrent Modular Networks
Dongkuan Xu, Wei Cheng, Xin Dong, Bo Zong, Wenchao Yu, Jingchao Ni, Dongjin Song, Xuchao Zhang, Haifeng Chen, Xiang Zhang
[AAAI 2021] The 35th AAAI Conference on Artificial Intelligence
Code / Supp / Slides

We propose MT-RMN to dynamically learn task relationships and accordingly learn to assemble composable modules into complex layouts to jointly solve multiple sequence processing tasks.

Transformer-Style Relational Reasoning with Dynamic Memory Updating for Temporal Network Modeling
Dongkuan Xu, Junjie Liang, Wei Cheng, Hua Wei, Haifeng Chen, Xiang Zhang
[AAAI 2021] The 35th AAAI Conference on Artificial Intelligence
Code / Supp / Slides

We propose TRRN to model temporal networks by employing transformer-style self-attention to reason over a set of memories.

How Do We Move: Modeling Human Movement with System Dynamics
Hua Wei, Dongkuan Xu, Junjie Liang, Zhenhui Li
[AAAI 2021] The 35th AAAI Conference on Artificial Intelligence
Code / Supp / Slides

We propose MoveSD to model state transitions in human movement from a novel perspective, by learning the decision model and integrating system dynamics.

Longitudinal Deep Kernel Gaussian Process Regression
Junjie Liang, Yanting Wu, Dongkuan Xu, Vasant Honavar
[AAAI 2021] The 35th AAAI Conference on Artificial Intelligence
Code / Supp / Slides

We introduce longitudinal deep kernel Gaussian process regression to fully automate the discovery of complex multi-level correlation structure from longitudinal data.

2020

Parameterized Explainer for Graph Neural Network
Dongsheng Luo, Wei Cheng, Dongkuan Xu, Wenchao Yu, Bo Zong, Haifeng Chen, Xiang Zhang
[NeurIPS 2020] The 34th Conference on Neural Information Processing Systems
Code / Supp / Slides

We propose to adopt deep neural networks to parameterize the generation process of explanations, which enables a natural approach to multi-instance explanations.

Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification
Xin Dong, Yaxin Zhu, Yupeng Zhang, Zuohui Fu, Dongkuan Xu, Sen Yang, Gerard de Melo
[SIGIR 2020] The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
Code / Supp / Slides

We propose a semi-supervised adversarial perturbation framework that encourages the model to be more robust to cross-lingual divergence and to better adapt to the target language.

Tensorized LSTM with Adaptive Shared Memory for Learning Trends in Multivariate Time Series
Dongkuan Xu, Wei Cheng, Bo Zong, Dongjin Song, Jingchao Ni, Wenchao Yu, Yanchi Liu, Haifeng Chen, Xiang Zhang
[AAAI 2020] The 34th AAAI Conference on Artificial Intelligence
Code / Poster / Slides

We propose a deep architecture for learning trends in multivariate time series, which jointly learns both local and global contextual features for predicting the trend of time series.

Longitudinal Multi-Level Factorization Machines
Junjie Liang, Dongkuan Xu, Yiwei Sun, Vasant Honavar
[AAAI 2020] The 34th AAAI Conference on Artificial Intelligence
Code / Supp

We propose the longitudinal multi-level factorization machine, to the best of our knowledge the first model to address the challenges of learning predictive models from longitudinal data.

2019

Adaptive Neural Network for Node Classification in Dynamic Networks
Dongkuan Xu, Wei Cheng, Dongsheng Luo, Yameng Gu, Xiao Liu, Jingchao Ni, Bo Zong, Haifeng Chen, Xiang Zhang
[ICDM 2019] The 19th IEEE International Conference on Data Mining
Slides

We propose an adaptive neural network for node classification in dynamic networks, which is able to consider the evolution of both node attributes and network topology.

Spatio-Temporal Attentive RNN for Node Classification in Temporal Attributed Graphs
Dongkuan Xu, Wei Cheng, Dongsheng Luo, Xiao Liu, Xiang Zhang
[IJCAI 2019] The 28th International Joint Conference on Artificial Intelligence
Code / Poster / Slides

We propose a spatio-temporal attentive RNN model, which aims to learn node representations for classification by jointly considering both the temporal and spatial patterns of the node.

Deep Co-Clustering
Dongkuan Xu, Wei Cheng, Dongsheng Luo, Xiao Liu, Xiang Zhang
[SDM 2019] The 19th SIAM International Conference on Data Mining
Code / Supp / Poster / Slides

DeepCC utilizes the deep autoencoder for dimension reduction, and employs a variant of Gaussian mixture model to infer the cluster assignments. A mutual information loss is proposed to bridge the training of instances and features.

2018

Co-Regularized Deep Multi-Network Embedding
Jingchao Ni, Shiyu Chang, Xiao Liu, Wei Cheng, Haifeng Chen, Dongkuan Xu, and Xiang Zhang
[WWW 2018] The 27th International Conference on World Wide Web
Code

DMNE coordinates multiple neural networks (one for each input network data) with a co-regularized loss function to manipulate cross-network relationships, which can be many-to-many, weighted and incomplete.

Multiple Instance Learning Based on Positive Instance Graph
Dongkuan Xu, Wei Zhang, Jia Wu, Yingjie Tian, Qin Zhang, Xindong Wu
arXiv preprint

Most multi-instance learning (MIL) methods that study true positive instances ignore 1) the global similarity among positive instances and 2) the fact that negative instances are non-i.i.d. We propose a MIL method based on positive instance graph updating to address these issues.

A Review of Multi-Instance Learning Research
Yingjie Tian, Dongkuan Xu, Chunhua Zhang
Operations Research Transactions, 2018

This paper reviews the research progress of multi-instance learning (MIL), introduces different assumptions, and categorizes MIL methods into instance-level, bag-level, and embedded-space approaches. Extensions and major applications in various areas are also discussed.

2017

SALE: Self-Adaptive LSH Encoding for Multi-Instance Learning
Dongkuan Xu, Jia Wu, Dewei Li, Yingjie Tian, Xingquan Zhu, Xindong Wu
Pattern Recognition, 2017

We propose a self-adaptive locality-sensitive hashing encoding method for multi-instance learning (MIL), which efficiently deals with large MIL problems.

Metric Learning for Multi-Instance Classification with Collapsed Bags
Dewei Li, Dongkuan Xu, Jingjing Tang, Yingjie Tian
[IJCNN 2017] The 30th IEEE International Joint Conference on Neural Networks

We propose a metric learning method for multi-instance classification, aiming to find an instance-dependent metric by maximizing the relative distance at the neighborhood level.

2016

PIGMIL: Positive Instance Detection via Graph Updating for Multiple Instance Learning
Dongkuan Xu, Jia Wu, Wei Zhang, Yingjie Tian
arXiv preprint arXiv:1612.03550, 2016

We propose a positive instance detection method based on multiple instance learning, whose core idea is that true positive instances should not only be globally similar to one another but also robustly distinct from negative instances.

Multi-Metrics Classification Machine
Dewei Li, Wei Zhang, Dongkuan Xu, Yingjie Tian
[ITQM 2016] The 4th International Conference on Information Technology and Quantitative Management
(Best Paper Award)

We propose a metric learning approach called multi-metrics classification machine. We establish an optimization problem for each class (each metric) to learn multiple metrics independently.

2015

A Comprehensive Survey of Clustering Algorithms
Dongkuan Xu, Yingjie Tian
Annals of Data Science, 2015
(891 citations)

We introduce the definition of clustering and the basic elements involved in the clustering process, and categorize clustering algorithms into traditional and modern ones. All the algorithms are discussed comprehensively.

Undergraduate

A Support Vector Machine-based Ensemble Prediction for Crude Oil Price with VECM and STEPMRS
Dongkuan Xu, Tianjia Chen, Wei Xu
International Journal of Global Energy Issues, 2015

This paper proposes a support vector machine-based ensemble model to forecast crude oil prices based on VECM and a stochastic time effective pattern modelling and recognition system (STEPMRS).

A Neural Network-Based Ensemble Prediction Using PMRS and ECM
Dongkuan Xu, Yi Zhang, Cheng Cheng, Wei Xu, Likuan Zhang
[HICSS 2014] The 47th Hawaii International Conference on System Sciences

This paper presents an integrated model to forecast crude oil prices, where a pattern modelling and recognition system is used to model the price trend and an error correction model is used to forecast errors. A neural network layer is employed to integrate the results.

Professional Services
  • Academic Committee Member:
    • MLNLP
  • Senior Program Committee Member:
    • IJCAI'21
  • Program Committee Member:
    • ICLR'21, 22
    • ICML'21
    • NeurIPS'20, 21
    • AAAI'20, 21, 22
    • KDD'20, 21 (Research Track), KDD'20, 22 (Applied Science Track)
    • ACL Rolling Review'22
    • IJCAI'20, 22
    • NAACL'21
    • EMNLP'20, 21
    • WSDM'22
    • SDM'22
    • EACL'21
    • ACM CIKM'20, 21
    • AACL-IJCNLP'20
    • IJCNN'18, 19, 20, 21
  • Journal Reviewer:
    • IEEE Transactions on Knowledge and Data Engineering (TKDE)
    • IEEE Transactions on Cybernetics
    • Information Fusion
    • ACM Transactions on Knowledge Discovery from Data (TKDD)
    • Pattern Recognition
    • Neural Networks
    • ACM Transactions on Asian and Low-Resource Language Information Processing
    • IEEE Access
    • Neural Computation
    • Complexity
    • Soft Computing
    • Complex & Intelligent Systems
    • Multimedia Tools and Applications
    • Big Data
  • External Conference Reviewer:
    • AAAI'18, 19, 20, KDD'18, 19, 20, 21, TheWebConf (WWW)'20, 21, 22, WSDM'20, 21, ICDM'18, 19, 21, SDM'18, 19, 20, 21, 22, ACM CIKM'18, 19, Big Data'18, IJCNN'16, 17, ITQM'16, 17
  • Conference Volunteer:
    • The Annual Conference of NAACL-HLT, 2021
    • Backup for SDM Session Chairs, 2021
    • The 35th AAAI Conference on Artificial Intelligence, 2021
    • The 26th SIGKDD Conference on Knowledge Discovery and Data Mining, 2020
Patent Applications
  • System and Method for Knowledge-Preserving Neural Network Pruning.
    Enxu Yan, Dongkuan Xu, and Zhibin Xiao.
    U.S. Patent App. Mar. 2021.

  • Neural Network Pruning Method and System via Layerwise Analysis.
    Enxu Yan, Dongkuan Xu, and Jiachao Liu.
    U.S. Patent App. 17/107,046. Nov. 2020.

  • Bank-Balanced Sparse Activation Feature Maps for Neural Network Models.
    Enxu Yan, Dongkuan Xu, and Jiachao Liu.
    U.S. Patent App. Sep. 2020.

  • Unsupervised Multivariate Time Series Trend Detection for Group Behavior Analysis.
    Wei Cheng, Haifeng Chen, Jingchao Ni, Dongkuan Xu, and Wenchao Yu.
    U.S. Patent App. 16/987,734. Mar. 2021.

  • Tensorized LSTM with Adaptive Shared Memory for Learning Trends in Multivariate Time Series.
    Wei Cheng, Haifeng Chen, Jingchao Ni, Dongkuan Xu, and Wenchao Yu.
    U.S. Patent App. 16/987,789. Mar. 2021.

  • Adaptive Neural Networks for Node Classification in Dynamic Networks.
    Wei Cheng, Haifeng Chen, Wenchao Yu, and Dongkuan Xu.
    U.S. Patent App. 16/872,546. Nov. 2020.

  • Spatio-Temporal Gated Recurrent Unit.
    Wei Cheng, Haifeng Chen, and Dongkuan Xu.
    U.S. Patent App. 16/787,820. Aug. 2020.

  • Automated Anomaly Precursor Detection.
    Wei Cheng, Dongkuan Xu, Haifeng Chen, and Masanao Natsumeda.
    U.S. Patent App. 16/520,632. Feb. 2020.

Talks
  • Chasing Efficiency of Pre-trained Language Models
    Redmond, Washington, USA, Jun. 2021.
    Microsoft Research Lab.

  • BERT Pruning: Structural vs. Sparse (Slides)
    Waltham, MA, USA, Apr. 2021.
    Brandeis University.

  • BERT, Compression and Applications (Slides)
    Mountain View, USA, Apr. 2021.
    Xpeng Motors.

  • BERT Architecture and Computation Analysis (Slides)
    Los Altos, USA, May 2020.
    Moffett.AI.

  • Learning Trends in Multivariate Time Series (Slides)
    New York, USA, Feb. 2020.
    AAAI 2020.

  • Node Classification in Dynamic Networks (Slides)
    Beijing, China, Nov. 2019.
    ICDM 2019.

  • Anomaly Precursor Detection via Deep Multi-Instance RNN (Slides)
    Princeton, USA, May 2019.
    NEC Laboratories America.

  • Deep Co-Clustering (Slides)
    Calgary, Canada, May 2019.
    SDM 2019.

  • Efficient Multiple Instance Learning (Slides)
    Princeton, USA, May 2018.
    NEC Laboratories America.

Honors and Awards
  • Doctor of Philosophy (Ph.D.)
    • College of IST Award for Excellence in Teaching Support (top 2), 2019
    • NAACL Scholarship, 2021
    • SIAM Student Travel Award, 2021
    • IST Travel Awards, Spring 2021, Fall 2021
    • College of IST Award for Excellence in Teaching Support, Finalist, 2021
    • KDD Student Registration Award, 2020
    • AAAI Student Scholarship, 2020
    • IST Travel Award, Spring 2020
    • IST Travel Award, Spring 2019
  • Master of Science (M.S.)
    • ITQM Best Paper, 2016
    • President’s Fellowship of Chinese Academy of Sciences (the most prestigious award), 2016
    • National Graduate Scholarship, China (top 2% in the university), 2016
    • Graduate Student Academic Scholarship, 2017
    • Graduate Student Academic Scholarship, 2016
    • Graduate Student Academic Scholarship, 2015
  • Bachelor of Engineering (B.E.)
    • First-class Scholarship of Sashixuan Elite Fund, China (top 5% in the university), 2014
    • Kwang-hua Scholarship of RUC, China, 2014
    • Second-class Scholarship of Excellent Student Cadre, 2014
    • Meritorious Winner in Mathematical Contest in Modeling, 2013
    • First-class Scholarship of Social Work and Volunteer Service of RUC, 2013
Extracurricular Activities
  • ACM (Association for Computing Machinery) Student Membership, 2021-Present
  • ACL (Association for Computational Linguistics) Membership, 2021-Present
  • AAAI (Association for the Advancement of Artificial Intelligence) Student Membership, 2019-Present
  • SIAM (Society for Industrial and Applied Mathematics) CAS Student Member, 2016-Present
  • President of Youth Volunteers Association of School of Information of RUC, 2012-2013
  • Volunteer of Beijing Volunteer Service Federation (BVF), 2012-Present
  • Leader of National Undergraduate Training Programs for Innovation and Entrepreneurship, 2011-2012


*Last updated on 11/23/2021*
This guy makes a nice webpage