Aston Zhang is a research scientist on the Llama team at Meta Generative AI and a core contributor to Llama 3. Previously, he served as a scientist and manager at AWS AI Research. His accolades include the ICLR Outstanding Paper Award, the ACM Ubicomp Distinguished Paper Award, and an ACM SenSys Best Paper Award nomination. His textbook, “Dive into Deep Learning,” is adopted worldwide. He holds a Ph.D. in Computer Science from the University of Illinois Urbana-Champaign.

Current research: pre-training architectures & scaling, long context (Llama 4).

News

  • Join us for a 2025 research internship! Just email me if you are interested in improving Llama with our team.
  • Llama 3.1 405B is now openly available.
  • Meet Llama 3, our state-of-the-art open-source large language model. Check out my developer podcast.

Books

  • A. Zhang, Z. C. Lipton, M. Li, and A. J. Smola
    Dive into Deep Learning
    Cambridge University Press, 2023
    • Adopted at 500 universities in 70 countries
    • Featured in the AWS re:Invent keynote by Swami, Head of AWS AI, Database, and Analytics
  • A. Zhang, M. Li, Z. C. Lipton, and A. J. Smola
    动手学深度学习 (Dive into Deep Learning, Chinese edition)
    Posts & Telecom Press (人民邮电出版社), 2nd ed., 2023; 1st ed., 2019

Papers

Tutorials

  • with A. J. Smola
    Attention in Deep Learning [Keynote] [PDF] [Video]
    In The 36th International Conference on Machine Learning (ICML), 2019

  • with H. Lin, X. Shi, L. Lausen, H. He, S. Zha, and A. J. Smola
    Dive into Deep Learning for Natural Language Processing
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

  • with H. Lin, L. Lausen, S. Zha, A. J. Smola, C. Wang, and M. Li
    From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond [Website]
    In The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019

  • with H. Zhang, T. He, Z. Zhang, Z. Zhang, H. Lin, and M. Li
    Everything You Need to Know to Reproduce SOTA Deep Learning Models from Hands-on Tutorial
    In International Conference on Computer Vision (ICCV), 2019

Services

  • Area Chair
    • Annual Meeting of the Association for Computational Linguistics (ACL)
    • Conference on Empirical Methods in Natural Language Processing (EMNLP)
    • International Conference on Computational Linguistics (COLING)