Pinzhen Chen
I am a Research Associate on the High Performance Language
Technologies project and a final-year PhD student in the School of Informatics at the University of
Edinburgh, supervised by Kenneth Heafield and Barry Haddow. I am a member of the large
machine translation group, EdinburghNLP, and the Institute for Language, Cognition and Computation.
I currently work on the applications of large language models in multilingual scenarios and machine
translation.
I also go by Patrick or 陈品桢 in Chinese. Last updated on 15 Mar 2024.
[pinzhen.chen@ed.ac.uk | Semantic Scholar | Google Scholar | GitHub | Hugging Face | LinkedIn]
Experience
- 2024-present, University of Edinburgh, Research Associate
- 2020-present, University of Edinburgh, PhD student, expected 2024
- 2023, Microsoft Research Asia, Research Visit
- 2022, Huawei Noah's Ark Lab, Research Scientist Intern
- 2019, University of Edinburgh, Research Assistant
- 2015-2019, University of Edinburgh, BEng Artificial Intelligence and Software Engineering. Awarded first-class honours and a Class Medal for the top performance in the degree.
- 2018, Goldman Sachs, Technology Analyst Intern
Services
- Program Committee/Reviewer:
  - Conference on Language Modeling (COLM): 2024
  - Association for Computational Linguistics Rolling Review (ARR): 2021, 2023, 2024
  - Joint Conference on Lexical and Computational Semantics (*SEM): 2022, 2023, 2024
  - Financial Support for Third Parties from the Horizon Europe project Unified Transcription and Translation for Extended Reality: 2023
  - NeurIPS Workshop on Instruction Tuning and Instruction Following: 2023
  - Conference on Empirical Methods in Natural Language Processing (EMNLP): 2023
  - Conference on Machine Translation (WMT): 2021, 2022
  - International Workshop on Semantic Evaluation (SemEval): 2022
- Teaching Assistant at University of Edinburgh:
  - Machine Learning Practical: mentor and marker, 2020-21, 2021-22, 2022-23
    - One project I supervised was shortlisted for a best project prize donated by IBM UK (5 out of 88) and published at LREC-COLING 2024.
  - Introductory Applied Machine Learning: marker, 2020-21, 2021-22
  - Informatics Research Proposal: tutor, 2020-21
  - System Design Project: mentor, 2018-19
Research
* denotes equal contribution.
Preprints
- Fine-tuning large language models with sequential instructions
Hanxu Hu*, Pinzhen Chen*, and Edoardo M. Ponti.
arXiv preprint.
- EEE-QA: Exploring effective and efficient question-answer representations
Zhanghao Hu*, Yijun Yang*, Junjie Xu*, Yifu Qiu, and Pinzhen Chen.
Accepted to the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation.
[.pdf | code]
- Large language model inference with lexical shortlisting
Nikolay Bogoychev, Pinzhen Chen, Barry Haddow, and Alexandra Birch.
Accepted to AAAI 2024 Workshop on Deployable AI.
- Iterative translation refinement with large language models
Pinzhen Chen, Zhicheng Guo, Barry Haddow, and Kenneth Heafield.
arXiv preprint.
Publications
- Monolingual or multilingual instruction tuning: Which makes a better Alpaca
Pinzhen Chen*, Shaoxiong Ji*, Nikolay Bogoychev, Andrey Kutuzov, Barry Haddow, and Kenneth Heafield.
In Findings of the Association for Computational Linguistics: EACL 2024.
[.pdf | .bib | code | data]
- PMIndiaSum: Multilingual and cross-lingual headline summarization for languages in India
Ashok Urlana*, Pinzhen Chen*, Zheng Zhao, Shay B. Cohen, Manish Shrivastava, and Barry Haddow. 2023.
In Findings of the Association for Computational Linguistics: EMNLP 2023.
[.pdf | .bib | poster | code | data]
- Terminology-aware translation with constrained decoding and large language model prompting
Nikolay Bogoychev* and Pinzhen Chen*. 2023.
In Proceedings of the Eighth Conference on Machine Translation.
[.pdf | .bib | poster]
- Towards effective disambiguation for machine translation with large language models
Vivek Iyer, Pinzhen Chen, and Alexandra Birch. 2023.
In Proceedings of the Eighth Conference on Machine Translation.
[.pdf | .bib | data]
- Exploring data augmentation for code generation tasks
Pinzhen Chen and Gerasimos Lampouras. 2023.
In Findings of the Association for Computational Linguistics: EACL 2023.
[.pdf | .bib | poster | talk]
- The University of Edinburgh's submission to the WMT22 code-mixing shared task (MixMT)
Faheem Kirefu, Vivek Iyer, Pinzhen Chen, and Laurie Burchell. 2022.
In Proceedings of the Seventh Conference on Machine Translation.
[.pdf | .bib]
- Edinburgh at SemEval-2022 task 1: Jointly fishing for word embeddings and definitions
Pinzhen Chen and Zheng Zhao. 2022.
In Proceedings of the 16th International Workshop on Semantic Evaluation.
[.pdf | .bib | poster | talk | code | best paper honorable mention out of 221]
- A unified model for reverse dictionary and definition modelling
Pinzhen Chen and Zheng Zhao. 2022.
In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing.
[.pdf | .bib | poster | talk | code]
- Approaching neural Chinese word segmentation as a low-resource machine translation task
Pinzhen Chen and Kenneth Heafield. 2022.
In Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation.
[.pdf | .bib | best paper award out of 94]
- To adapt or to fine-tune: A case study on abstractive summarization
Zheng Zhao and Pinzhen Chen. 2022.
In Proceedings of the 21st Chinese National Conference on Computational Linguistics.
[.pdf | .bib | poster | code]
- The highs and lows of simple lexical domain adaptation approaches for neural machine translation
Nikolay Bogoychev* and Pinzhen Chen*. 2021.
In Proceedings of the Second Workshop on Insights from Negative Results in NLP.
[.pdf | .bib | talk | poster]
- Efficient machine translation with model pruning and quantization
Maximiliana Behnke, Nikolay Bogoychev, Alham Fikri Aji, Kenneth Heafield, Graeme Nail, Qianqian Zhu, Svetlana Tchistiakova, Jelmer van der Linde, Pinzhen Chen, Sidharth Kashyap, and Roman Grundkiewicz. 2021.
In Proceedings of the Sixth Conference on Machine Translation.
[.pdf | .bib | poster]
- The University of Edinburgh's English-German and English-Hausa submissions to the WMT21 news translation task
Pinzhen Chen, Jindřich Helcl, Ulrich Germann, Laurie Burchell, Nikolay Bogoychev, Antonio Valerio Miceli Barone, Jonas Waldendorf, Alexandra Birch, and Kenneth Heafield. 2021.
In Proceedings of the Sixth Conference on Machine Translation.
[.pdf | .bib | poster]
- The University of Edinburgh's Bengali-Hindi submissions to the WMT21 news translation task
Proyag Pal, Alham Fikri Aji, Pinzhen Chen, and Sukanta Sen. 2021.
In Proceedings of the Sixth Conference on Machine Translation.
[.pdf | .bib | poster]
- Parallel sentence mining by constrained decoding
Pinzhen Chen*, Nikolay Bogoychev*, Kenneth Heafield, and Faheem Kirefu. 2020.
In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
[.pdf | .bib | talk | code]
- ParaCrawl: Web-scale acquisition of parallel corpora
Marta Bañón, Pinzhen Chen, Barry Haddow, Kenneth Heafield, Hieu Hoang, Miquel Esplà-Gomis, Mikel L. Forcada, Amir Kamran, Faheem Kirefu, Philipp Koehn, Sergio Ortiz Rojas, Leopoldo Pla Sempere, Gema Ramírez-Sánchez, Elsa Sarrías, Marek Strelec, Brian Thompson, William Waites, Dion Wiggins, and Jaume Zaragoza. 2020.
In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
[.pdf | .bib | talk | website]
- Character mapping and ad-hoc adaptation: Edinburgh's IWSLT 2020 open domain translation system
Pinzhen Chen, Nikolay Bogoychev, and Ulrich Germann. 2020.
In Proceedings of the 17th International Conference on Spoken Language Translation.
[.pdf | .bib | talk]
Thesis
Personal
I enjoy travelling and cooking. I sometimes play badminton and basketball, as well as board and card games.