Aston Zhang is a research scientist at Meta Generative AI, building large language models (Llama). Prior to this, he was a scientist and manager at Amazon Web Services AI Research, studying language and multimodal models. He received an ICLR Outstanding Paper Award, an ACM UbiComp Distinguished Paper Award, and an ACM SenSys Best Paper Award Nomination. His textbook, Dive into Deep Learning, is adopted worldwide. He obtained a Ph.D. in Computer Science from the University of Illinois Urbana-Champaign.

If you are interested in a research internship on large language models with our team in 2024, feel free to email me.


Papers (All)


  • with A. J. Smola
    Attention in Deep Learning [Keynote] [PDF] [Video]
    In The 36th International Conference on Machine Learning (ICML), 2019

  • with H. Lin, X. Shi, L. Lausen, H. He, S. Zha, and A. J. Smola
    Dive into Deep Learning for Natural Language Processing
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

  • with H. Lin, L. Lausen, S. Zha, A. J. Smola, C. Wang, and M. Li
    From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond [Website]
    In The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019

  • with H. Zhang, T. He, Z. Zhang, Z. Zhang, H. Lin, and M. Li
    Everything You Need to Know to Reproduce SOTA Deep Learning Models: Hands-on Tutorial
    In The International Conference on Computer Vision (ICCV), 2019


  • Area Chair
    • Annual Meeting of the Association for Computational Linguistics (ACL)
    • Conference on Empirical Methods in Natural Language Processing (EMNLP)
    • International Conference on Computational Linguistics (COLING)