Chengyue Jiang

PhD student at ShanghaiTech

About Me

Hi, my name’s Chengyue Jiang, and I’m currently a 4th-year PhD student at ShanghaiTech University. I majored in Computer Science, and my research focuses on Natural Language Processing (NLP) under the supervision of Prof. Kewei Tu.

More specifically, my NLP research interests are:

  • using symbolic knowledge (e.g., regular expressions) to make NLP models perform better and be more interpretable on NLP tasks;
  • information extraction, such as entity typing and named entity recognition;
  • information retrieval and related techniques;
  • investigating the ontological knowledge encoded in large pretrained language models.
I'm currently VERY INTERESTED in Large Language Models (LLMs) and keep exploring related techniques (instruction tuning, MetaICL, LoRA, Self-Instruct (Alpaca)).
I'm on the job market this autumn; feel free to contact me via email: jiangchy@shanghaitech.edu.cn

Education

  • I received my Bachelor's degree in Computer Science from ShanghaiTech University (June 2015 - June 2019).
  • I am now a PhD student at ShanghaiTech University, supervised by Prof. Kewei Tu.

Publications

ACL 2023

Chengyue Jiang, Wenyang Hui (Equal Contribution), Yong Jiang, Xiaobin Wang, Pengjun Xie, Kewei Tu. “Recall, Expand and Multi-Candidate Cross-Encode: Fast and Accurate Ultra-Fine Entity Typing”. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. (Paper)

ACL 2023 (Outstanding Paper Award)

Weiqi Wu, Chengyue Jiang, Yong Jiang, Xiaobin Wang, Pengjun Xie, Kewei Tu. “Do PLMs Know and Understand Ontological Knowledge?”. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. (Paper)

EACL 2023

Chengyue Jiang, Yong Jiang, Weiqi Wu, Yuting Zheng, Pengjun Xie, Kewei Tu. “COMBO: A Complete Benchmark for Open KG Canonicalization”. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. (Paper) (Code)

EMNLP 2022

Chengyue Jiang, Yong Jiang, Weiqi Wu, Zhongqiang Huang, Pengjun Xie, Kewei Tu. “Modeling Label Correlations for Ultra-Fine Entity Typing with Neural Pairwise Conditional Random Field”. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. (Paper) (Code)

EMNLP 2021

Chengyue Jiang, Zijian Jin, Kewei Tu. “Neuralizing Regular Expressions for Slot Filling”. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. (Paper) (Code)

EMNLP 2020

Chengyue Jiang, Yinggong Zhao, Shanbo Chu, Libin Shen, and Kewei Tu. “Cold-start and Interpretability: Turning Regular Expressions into Trainable Recurrent Neural Networks”. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. (Paper) (Code)

Findings of EMNLP 2020

Chengyue Jiang, Zhonglin Nian, Kaihao Guo, Shanbo Chu, Yinggong Zhao, Libin Shen, and Kewei Tu. “Learning Numeral Embeddings”. In Findings of the Association for Computational Linguistics: EMNLP 2020. (Paper) (Code)

CoNLL 2019

Xinyu Wang, Yixian Liu, Zixia Jia, Chengyue Jiang, and Kewei Tu. “ShanghaiTech at MRP 2019: Sequence-to-Graph Transduction with Second-Order Edge Inference for Cross-Framework Meaning Representation Parsing”. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL 2019). (Paper) (Poster)

Experience

NLP Research Intern

Mihoyo, Lumi NLP

June 2023 -- Present

Exploring large language models (LLMs).

NLP Research Intern

Alibaba Damo Academy, Hangzhou

August 2021 -- June 2023

Conducted research on large language models (LLMs) and information extraction. Served as the core developer of the entity typing part of the AdaSeq project at DAMO Academy.

Software Intern

Honeywell, Nanjing

July 2018 -- September 2018

Conducted early research on using image captioning to assist fault detection, using a simple sequence-to-sequence architecture implemented in Keras.

Web Intern

Boonray Technology, Shanghai

July 2016 -- September 2016

Software engineering internship. Participated in developing a web app with a Django backend and MySQL. The web app is a simple cloud platform for managing customers.

Projects

AdaSeq

One of the core contributors

AdaSeq is an open-source sequence understanding library developed by DAMO NLP. It contains state-of-the-art named entity recognition, entity typing, and relation extraction models. I served as the core developer of the entity typing part of AdaSeq, implementing and replicating entity typing models including multi-label classification (MLC), NPCRF, MCCE, prompt learning, etc.

RegExp to Neural Network

Author

The source code (RE2NN, RE2NN-SEQ) for my papers “Cold-start and Interpretability: Turning Regular Expressions into Trainable Recurrent Neural Networks” and “Neuralizing Regular Expressions for Slot Filling” has received over 130 stars on GitHub. The repos turn regular expressions for text classification and slot filling into neural networks.
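
The core idea behind both repos: a regular expression compiles into a finite automaton, and running that automaton over a sentence is repeated matrix multiplication, i.e., a simple linear RNN whose parameters can then be relaxed and trained. Below is a minimal toy sketch of the automaton-as-RNN view, assuming a made-up pattern and vocabulary (the "matches" function is a hypothetical illustration, not code from the repos):

    import numpy as np

    # Toy vocabulary and a regex-like pattern "buy .* ticket".
    vocab = {"buy": 0, "a": 1, "ticket": 2}
    num_states = 3  # automaton states: 0 = start, 1 = saw "buy", 2 = accept

    # One transition matrix per word: T[w][i, j] = 1 iff reading word w
    # moves the automaton from state i to state j.
    T = np.zeros((len(vocab), num_states, num_states))
    T[vocab["buy"], 0, 1] = 1.0
    T[vocab["a"], 1, 1] = 1.0        # wildcard-style self-loop
    T[vocab["ticket"], 1, 2] = 1.0

    def matches(sentence):
        # The hidden state is a vector over automaton states, updated by
        # matrix multiplication -- a linear RNN with no nonlinearity.
        h = np.zeros(num_states)
        h[0] = 1.0                    # start in state 0
        for word in sentence.split():
            h = h @ T[vocab[word]]    # recurrent update
        return bool(h[2] > 0)         # did we reach the accept state?

    print(matches("buy a ticket"))    # True
    print(matches("a ticket"))        # False

The papers go further than this sketch: the transition tensor is decomposed into low-rank factors that serve as trainable RNN parameters and interact with word embeddings, which is what makes the resulting networks both cold-start-friendly and interpretable.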

Services

  • ACL 2023, EACL 2023, AACL 2022, ICLR 2022