Peidong Liu

[Google Scholar]

Advanced Algorithm Engineer at DJI Automotive

Shenzhen, Guangdong, China.

Email: perdon.liu@gmail.com

Biography

I currently lead the visual-language-action (VLA) model work in the DJI Automotive Perception Group, with a particular focus on fine-tuning VLA models to address the challenges posed by long-tail scenarios. Before that, I was primarily responsible for Bird's Eye View (BEV) lane detection and large-scale multimodal retrieval systems. If you are interested in an internship opportunity, please feel free to drop me an email.

I obtained my M.S. in Computer Science from Tsinghua University in 2022, as an outstanding graduate. I have been fortunate to work closely with Prof. Xiaodan Liang at Sun Yat-sen University, Dr. Hang Xu at Huawei Noah's Ark Lab, and Dr. Litong Feng and Dr. Xinjiang Wang at SenseTime Research. I received my B.S. in Software Engineering from Sun Yat-sen University summa cum laude in 2019. My research interests lie in computer vision and visual-language models.

News

(2025-04) The VLA capabilities were publicly demonstrated at the 2025 Shanghai Auto Show (see WeChat link for details).
(2024-01) I was awarded the 2023 Annual Efficiency Vanguard Award at DJI Automotive for outstanding contributions.
(2022-07) Two of our papers were accepted by ECCV2022.
(2022-06) I was awarded both University-level (Top 1%) and Department-level (Top 5%) Outstanding Graduate at Tsinghua University.
(2022-06) I was awarded the Outstanding Master's Thesis Award at Tsinghua University (Top 5%).
(2021-10) I was awarded the National Scholarship for Postgraduates at Tsinghua University (Top 1%).
(2021-07) Our work was accepted by ACM MM2021 as an Oral paper.
(2021-02) I was invited to give a talk about our ICLR2021 paper at the QingYuan (青源 in Chinese) Seminar, organized by the Beijing Academy of Artificial Intelligence (BAAI). Thanks to SenseTime for the invitation. Please see more details here.
(2021-01) Our paper was accepted by ICLR2021. It is the first Autoloss work for object detection. The code is released here.

Publications

* denotes equal contribution.

In Submission

SimCC: a Simple Coordinate Classification Perspective for Human Pose Estimation [PDF]

Yanjie Li, Sen Yang, Peidong Liu, Shu-Tao Xia

European Conference on Computer Vision (ECCV), 2022

NeXT: Towards High Quality Neural Radiance Fields via Multi-Skip Transformer [PDF]

Yunxiao Wang, Yanjie Li, Peidong Liu, Tao Dai, Shu-Tao Xia

European Conference on Computer Vision (ECCV), 2022

Multi-task Ranking with User Behaviors for Text-Video Search [PDF]

Peidong Liu, Dongliang Liao, Jinpeng Wang, Yangxin Wu, Gongfu Li, Shu-Tao Xia, Jin Xu

International World Wide Web Conferences (WWW, CCF-A) Companion, 2022

Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search [PDF] [Talk] [Poster] [PPT] [Code]

Peidong Liu*, Gengwei Zhang*, Bochao Wang, Hang Xu, Xiaodan Liang, Yong Jiang, Zhenguo Li

International Conference on Learning Representations (ICLR), 2021

WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations [PDF]

Peidong Liu*, Zibin He*, Xiyu Yan*, Yong Jiang, Shu-Tao Xia, Feng Zheng, Maowei Hu

ACM International Conference on Multimedia (ACM MM, CCF-A) Oral, 2021

Visual Privacy Protection via Mapping Distortion [PDF] [Code]

Yiming Li*, Peidong Liu*, Yong Jiang, Shu-Tao Xia

International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021

Deep Flow Collaborative Network for Online Visual Tracking [PDF]

Peidong Liu, Xiyu Yan, Yong Jiang, Shu-Tao Xia

International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020

LDA Meets Word2Vec: A Novel Model for Academic Abstract Clustering [PDF]

Changzhou Li, Yao Lu, Junfeng Wu, Yongrui Zhang, Zhongzhou Xia, Tianchen Wang, Dantian Yu, Xurui Chen, Peidong Liu, Junyu Guo

International World Wide Web Conferences (WWW, CCF-A) Companion, 2018

Selected Awards

2024.01 DJI Automotive 2023 Annual Efficiency Vanguard Award
2022.06 University-level (Top 1%) and Department-level (Top 5%) Outstanding Graduate at Tsinghua University
2022.06 Outstanding Master's Thesis Award at Tsinghua University (Top 5%)
2021.10 National Scholarship for Postgraduate (Top 1%)
2019.06 Outstanding Graduate of Sun Yat-sen University (Top 3%)
2018.10 Second Class Academic Scholarship of Sun Yat-sen University (Top 8%)
2017.10 Bronze Award in Intel Cup – Parallel Application Challenge (PAC) 2017, China (Top 6%)
2017.10 First Class Academic Scholarship of Sun Yat-sen University (Top 3%)
2017.01 Honorable Mention in Interdisciplinary Contest in Modeling
2016.10 First Class Academic Scholarship of Sun Yat-sen University (Top 3%)

Research Experience in Academia and Industry

2022.07 - present	Perception Group, DJI Automotive
Advanced Computer Vision Algorithm Engineer
Responsible for applying Visual-Language-Action (VLA) models to autonomous driving from scratch, including an extensive survey of open-source data and models and the establishment of the data annotation and model fine-tuning pipelines. The work covers prompt design, LoRA fine-tuning, and multi-node training with DeepSpeed. Fine-tuned on both open-source and proprietary datasets, the model has perception, decision-making, and planning capabilities. The VLA model delivers two types of capabilities: understanding and decision-making in long-tail scenarios, and voice control with ambiguous commands.
Developed a large multimodal model for image-text retrieval, focusing on mining long-tail data of user interest from massive video databases to support autonomous driving applications. Specifically, I used a large language model (LLM) and a diffusion model to generate synthetic data, augmenting the existing dataset and improving the model's performance.
Developed multiple Bird's Eye View (BEV) lane detection solutions, including temporal BEV, fisheye BEV, and road topology, to support the deployment of various solutions in mass production projects (I achieved great performance as a result of this work).
Optimized the closed-loop data recycling process for lane detection, which covers the entire cycle of data collection, feedback, filtering, reconstruction, and annotation. Streamlining coordination and collaboration among several modules improved data recycling efficiency by over 100% (I was awarded the 2023 Annual Efficiency Vanguard Award at DJI Automotive for this contribution).


2019.09 - 2022.06	Department of Computer Science and Technology, Tsinghua University
Master Student
Supervisor: Shu-Tao Xia
Proposed Memory Flow Distillation (MFD) for video semantic segmentation. MFD uses weakly-supervised training, optical flow, and distillation to alleviate two issues: the scarcity of fine annotations and low inference speed. For PSPNet with MobileNetV2, MFD improves performance by 10.24% mIoU and reaches real-time speed (ACM MM2021 Oral).
Proposed a Deep Flow Collaborative Network (DFCNet) for online visual tracking. DFCNet runs the complex feature network only on sparse keyframes, which are selected by the proposed adaptive keyframe scheduling. DFCNet maximizes the benefits of both appearance features and temporal information and runs 30% faster than the baseline without compromising accuracy (ICASSP2020).


2021.06 - 2022.05	Search Application Department, WeChat Group, Tencent
Computer Vision Algorithm Engineer Intern
To address low click-through and completion rates in video retrieval for WeChat Channel, as well as imprecise query-item matching, we defined a new problem: multi-target ranking for video retrieval. We extracted 800k query-document pairs from user interaction logs and used a multimodal fusion model combined with the MMOE framework as the baseline. Modeling multiple objectives jointly achieved a 3% improvement in the average AUC-ROC for each objective.


2020.04 - 2021.03	Noah's Ark Lab, Huawei
Research Intern
Mentors: Xiaodan Liang, Hang Xu, Bochao Wang
Proposed an effective convergence-simulation driven evolutionary search algorithm, called CSE-Autoloss, for object detection loss function discovery, which achieves a 20x speedup via progressive convergence-simulation modules (ICLR2021).


2019.07 - 2019.09	Y-Tech AI Lab, Beijing Kuaishou Technology Ltd.
AI Intern
Improved the accuracy of face parsing with landmarks by around 2% over the baseline UNet model.


2018.11 - 2019.06	Fundamental Technique Research Group, SenseTime Research
Research Intern
Mentor: Litong Feng (Senior Researcher, Ph.D.)
Solely responsible for building the entire pipeline for converting PyTorch models to Caffe models, including models for classification (ResNet, Inception-ResNet series) and object detection (SSD, Faster R-CNN).


2018.03 - 2018.05	NUS-Tsinghua Center for Extreme Search (NExT++), NUS, Singapore
Research Assistant
Mentor: Zhaoyan Ming (Ph.D., Team Head, NExT++)
Implemented an algorithm to classify Southeast Asian food with complex names and meanings.


2017.10 - 2018.02	Smart Mobile Computing Lab, Advanced Networking and Computing Systems Institute, SYSU
Research Assistant
Mentor: Xu Chen (Professor, School of Data and Computer Science, SYSU)
Modeled 30 GB of WeChat Moments articles using effective structural features.
Applied Logistic Regression, Random Forest, and GBDT to predict information growth.


2017.07 - 2017.10	Natural Language Processing Group, Guangdong Province Key Laboratory of Computational Science, SYSU
Research Assistant
Mentor: Yao Lu (Professor, School of Data and Computer Science, SYSU)
Participated in text analysis and text mining of medical scientific literature, including preprocessing, word vector representation with Word2Vec, vector dimension reduction with PCA, keyword extraction via TF-IDF, topic number analysis via the AP algorithm, and article topic extraction via LDA.
Parallelized the pipeline with Spark.

Academic Service

Conference Reviewer for AAAI 2022, WWW 2022.

Education

2019.09 - 2022.06, Master's student in the Department of Computer Science and Technology at Tsinghua University

2015.09 - 2019.06, undergraduate student in the School of Computer Science and Engineering at Sun Yat-sen University, ranked 3/119

2018.01 - 2018.05, exchange student in the School of Computing at National University of Singapore, research intern in the NExT++ lab