Aston Zhang is a research scientist at Meta Generative AI, building large language models (Llama). Prior to this, he was a scientist/manager at Amazon Web Services AI Research, studying language and multimodal models. He received an ICLR Outstanding Paper Award, an ACM UbiComp Distinguished Paper Award, and an ACM SenSys Best Paper Award Nomination. His Dive into Deep Learning textbook is adopted worldwide. He obtained a Ph.D. in Computer Science from the University of Illinois Urbana-Champaign.
If you are interested in a research internship on large language models with our team in 2024, feel free to email me.
Books
- A. Zhang, Z. C. Lipton, M. Li, and A. J. Smola
Dive into Deep Learning
Cambridge University Press, 2023
- Adopted at 500 universities from 70 countries
- Featured in the AWS re:Invent keynote by Swami Sivasubramanian, Head of AWS AI, Database, and Analytics
- A. Zhang, M. Li, Z. C. Lipton, and A. J. Smola
动手学深度学习 (Dive into Deep Learning, Chinese edition)
Posts & Telecom Press (人民邮电出版社), 2nd ed., 2023; 1st ed., 2019
- Best seller in China
Papers (All)
Z. Zhang and A. Zhang
You Only Look at Screens: Multimodal Chain-of-Action Agents
“Perform a task on smartphones? Train an agent using screenshots.” In arXiv, 2023
Z. Zhang, A. Zhang, M. Li, H. Zhao, G. Karypis, and A. J. Smola
Multimodal Chain-of-Thought Reasoning in Language Models
“Imagine reading a book without figures: Multimodal-CoT surpasses humans on ScienceQA.” In arXiv, 2023
[Idea Inspiration by Homeschooling]
S. Ren, A. Zhang, Y. Zhu, S. Zhang, S. Zheng, M. Li, A. J. Smola, X. Sun
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
In Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2023
Z. Zeng, C. Hawkins, M. Hong, A. Zhang, N. Pappas, V. Singh, and S. Zheng
Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
In Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2023
J. Chen, A. Zhang, X. Shi, M. Li, A. J. Smola, and D. Yang
Parameter-Efficient Fine-Tuning Design Spaces
In Proceedings of the International Conference on Learning Representations (ICLR), 2023
Z. Zhang, A. Zhang, M. Li, and A. J. Smola
Automatic Chain of Thought Prompting in Large Language Models
In Proceedings of the International Conference on Learning Representations (ICLR), 2023
Z. Liu, Z. Tang, X. Shi, A. Zhang, M. Li, A. Shrivastava, and A. Wilson
Learning Multimodal Data Augmentation in Feature Space
In Proceedings of the International Conference on Learning Representations (ICLR), 2023
T. Yang, Y. Zhu, Y. Xie, A. Zhang, C. Chen, and M. Li
AIM: Adapting Image Models for Efficient Video Understanding
In Proceedings of the International Conference on Learning Representations (ICLR), 2023
C. Qin, A. Zhang, Z. Zhang, J. Chen, M. Yasunaga, and D. Yang
Is ChatGPT a General-Purpose Natural Language Processing Task Solver?
In Empirical Methods in Natural Language Processing (EMNLP), 2023
J. Chen, A. Zhang, D. Yang, M. Li, and A. J. Smola
A Cheaper and Better Diffusion Language Model with Soft-Masked Noise
In Empirical Methods in Natural Language Processing (EMNLP), 2023
H. Wang, A. Zhang, Y. Zhu, S. Zheng, M. Li, A. J. Smola, and Z. Wang
Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition
In Proceedings of the International Conference on Machine Learning (ICML, Long Presentation), 2022
H. Wang, A. Zhang, S. Zheng, X. Shi, M. Li, and Z. Wang
Removing Batch Normalization Boosts Adversarial Training
In Proceedings of the International Conference on Machine Learning (ICML), 2022
A. Zhang, Y. Tay, S. Zhang, A. Chan, A. T. Luu, S. C. Hui, and J. Fu
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters
In Proceedings of the International Conference on Learning Representations (ICLR, Outstanding Paper Award), 2021
Tutorials
with A. J. Smola
Attention in Deep Learning [Keynote] [PDF] [Video]
In The 36th International Conference on Machine Learning (ICML), 2019
with H. Lin, X. Shi, L. Lausen, H. He, S. Zha, and A. J. Smola
Dive into Deep Learning for Natural Language Processing
In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
with H. Lin, L. Lausen, S. Zha, A. J. Smola, C. Wang, and M. Li
From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond [Website]
In The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019
with H. Zhang, T. He, Z. Zhang, Z. Zhang, H. Lin, and M. Li
Everything You Need to Know to Reproduce SOTA Deep Learning Models: Hands-on Tutorial
In International Conference on Computer Vision (ICCV), 2019
Services
- Area Chair
- Annual Meeting of the Association for Computational Linguistics (ACL)
- Conference on Empirical Methods in Natural Language Processing (EMNLP)
- International Conference on Computational Linguistics (COLING)