CV
Education
- Ph.D. in Computational Science and Engineering, Georgia Institute of Technology, 2019.5 - 2024.8 (expected), advisor: Prof. Edmond Chow.
- M.S. in Computer Science, Georgia Institute of Technology, 2017.8 - 2019.5, advisor: Prof. Edmond Chow.
- B.S. in Information and Computing Science, Sun Yat-sen University, 2013.8 - 2017.6, advisor: Dr. Weicai Ye (叶纬材博士).
Research Experience
- Graduate Research Assistant, Georgia Institute of Technology (2018.1 - 2024.8, supervisor: Prof. Edmond Chow)
- Accelerating quantum chemistry calculations: up to 6x overall speedup of Hartree-Fock calculation for a 1208-atom molecular system with multiple optimizations.
- Developing the H2Pack library: up to 6x speedup over a state-of-the-art FMM-based method in high-accuracy calculations, plus a 90% reduction in memory usage and a 7x speedup over contemporary methods for quantum chemistry and hydrodynamic simulation calculations.
- Designing parallel linear algebra algorithms: up to 25% and 4x speedups for dense and sparse matrix multiplication, respectively, compared to state-of-the-art methods.
Work Experience
- Software Engineer Intern, Meta (Facebook) Inc (2021.5 - 2021.8, supervisor: Dr. Xing Liu)
- Investigated different parallelization schemes of the Mixture of Experts (MoE) model.
- Implemented a PyTorch-based hybrid parallel MoE model for a recommendation system.
- Delivered a 6x speedup on 64 NVIDIA V100 GPUs compared to the existing MoE model deployed in production.
- Research Intern, Intel Corporation (2020.8 - 2020.12, supervisor: Dr. Jeff Hammond)
- Analyzed the code of Kokkos Remote Workspace and NVSHMEM.
- Designed and implemented CL-PGAS using SYCL + MPI-3, the first library in the industry enabling inter-node data transfer controlled by kernels running on Intel / NVIDIA / AMD GPUs.
- Achieved 95% of the performance of NVIDIA’s proprietary library NVSHMEM.
- Assistant Engineer, HPC Department of National Super-Computing Center in Shenzhen (2015.8 - 2015.9, supervisor: Dr. Jianwen Liu (刘建文博士))
- Optimized the performance of two major applications in NSCC-SC: VASP (25%) and Amber (30%).
- Provided 50+ customer technical consultations and support.
Skills
- Programming languages: C, C++, MATLAB, Fortran, Python
- Frameworks and technologies: OpenMP, MPI, CUDA, OpenCL, SYCL, x86 intrinsics, ARM (NEON and SVE) intrinsics
Publications
Publications are listed in reverse chronological order. The most up-to-date list of my publications can be found on my Google Scholar profile. The BibTeX file for all my published papers can be downloaded here.
Submitted Papers and Preprints
[S.1] Exploring the Design Space of Distributed Parallel Sparse Matrix‑Multiple Vector Multiplication. Hua Huang and Edmond Chow.
Peer-reviewed Conference Publications
[C.3] [SC 22] CA3DMM: A New Algorithm Based on a Unified View of Parallel Matrix Multiplication.
Hua Huang and Edmond Chow.
International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2022.
pdf ACM (includes presentation video)
[C.2] [IPDPS 19] Overlapping Communications with Other Communications and its Application to Distributed Dense Matrix Computations.
Hua Huang and Edmond Chow.
International Parallel & Distributed Processing Symposium (IPDPS), 2019.
pdf IEEE
[C.1] [SC 18] Accelerating Quantum Chemistry with Vectorized and Batched Integrals.
Hua Huang and Edmond Chow.
International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2018.
pdf ACM
Peer-reviewed Journal Publications
[J.8] [SIAM J. Sci. Comput.] An Adaptive Factorized Nystrom Preconditioner for Kernel Matrices.
Shifan Zhao, Tianshi Xu, Hua Huang, Edmond Chow, and Yuanzhe Xi.
SIAM Journal on Scientific Computing (accepted, to appear).
arXiv
[J.7] [SIAM J. Sci. Comput.] Data-Driven Construction of Hierarchical Matrices with Nested Bases.
Difeng Cai, Hua Huang, Edmond Chow, and Yuanzhe Xi.
SIAM Journal on Scientific Computing (2023).
pdf SIAM
[J.6] [J. Comput. Phys.] A Hierarchical Matrix Approach for Computing Hydrodynamic Interactions.
Xin Xing, Hua Huang, and Edmond Chow.
Journal of Computational Physics, 448, 110761 (2022).
pdf ScienceDirect
[J.5] [SoftwareX] SPARC: Simulation Package for Ab-initio Real-space Calculations.
Qimen Xu, Abhiraj Sharma, Benjamin Comer, Hua Huang, Edmond Chow, Andrew J Medford, John E Pask, and Phanish Suryanarayana.
SoftwareX, 15, 100709 (2021).
pdf ScienceDirect
[J.4] [SIAM J. Matrix Anal. Appl.] Efficient Construction of an HSS Preconditioner for Symmetric Positive Definite H2 Matrices.
Xin Xing, Hua Huang, and Edmond Chow.
SIAM Journal on Matrix Analysis and Applications, 42(2), 683–707 (2021).
pdf SIAM
[J.3] [ACM Trans. Math. Softw.] H2Pack: High-Performance H2 Matrix Package for Kernel Matrices Using the Proxy Point Method.
Hua Huang, Xin Xing, and Edmond Chow.
ACM Transactions on Mathematical Software, 47(1), 1–29 (2020).
pdf ACM
[J.2] [J. Chem. Phys.] A Linear Scaling Hierarchical Block Low-rank Representation of the Electron Repulsion Integral Tensor.
Xin Xing, Hua Huang, and Edmond Chow.
Journal of Chemical Physics, 153, 084119 (2020).
pdf AIP
[J.1] [J. Chem. Phys.] Techniques for High-Performance Construction of Fock Matrices.
Hua Huang, David Sherrill, and Edmond Chow.
Journal of Chemical Physics, 152, 024122 (2020).
pdf AIP
Contributed and Invited Talks
Talks are listed in reverse chronological order.
- Exploring the Design Space of Distributed Parallel Sparse Matrix‑Multiple Vector Multiplication
- invited talk at PP24: SIAM Conference on Parallel Processing for Scientific Computing (invited by Dr. Georg Hager), May 2024.
- New Parallel Algorithms for Quantum Chemistry and Large-Scale Matrix Computation
- invited talk at Sun Yat-sen University (invited by Dr. Yutong Lu), Sep 2023.
- CA3DMM: A New Algorithm Based on a Unified View of Parallel Matrix Multiplication, held at various occasions including:
- paper presentation at SC22: International Conference for High Performance Computing, Networking, Storage and Analysis, Nov 2022;
- lightning talk at GSCS 2022: Georgia Scientific Computing Symposium, Feb 2022.
- Towards Portable PGAS on GPUs
- invited talk at AN21: SIAM Annual Meeting 2021 (invited by Dr. Min Si), July 2021.
- Overlapping Communications with Other Communications and its Application to Distributed Dense Matrix Computations, held at various occasions including:
- invited talk at PP20: SIAM Conference on Parallel Processing for Scientific Computing (invited by Dr. Roel Van Beeumen), Feb 2020;
- paper presentation at IPDPS19: IEEE International Parallel and Distributed Processing Symposium, May 2019.
- Accelerating Quantum Chemistry with Vectorized and Batched Integrals, held at various occasions including:
- invited talk at Emory University (invited by Dr. Yuanzhe Xi), March 2019;
- paper presentation at SC18: International Conference for High Performance Computing, Networking, Storage and Analysis, Nov 2018.
Professional Service
Reviewer for conferences
- International Conference for High‑Performance Computing, Networking, Storage, and Analysis (SC)
- SC24, technical program committee: AD/AE reproducibility
- IEEE International Parallel & Distributed Processing Symposium (IPDPS)
- IPDPS24, technical program committee: Measurements, Modeling, and Experiments (MMEXP) track
Reviewer for journals
- Journal of Supercomputing
Program committees
- SC19 student cluster competition
Honors and Awards
- 2023 ACM-IEEE CS George Michael Memorial HPC Fellowships Honorable Mention
- 2020 Sigma Xi Best M.S. Thesis Research Award (Sigma Xi Georgia Tech Chapter)
- 2019 Georgia Tech Marshall D. Williamson Fellowship