    Machine learning (ML) is revolutionizing scientific research through its ever-growing capability to discover knowledge from massive simulation data. With the recent arrival of exascale computers, new opportunities and challenges are emerging in maximizing the synergy between exaFLOPS supercomputers and ML-boosted computational science. This thesis addresses two important and intertwined problems: (i) developing efficient billion-core parallel algorithms for scientific simulations, and (ii) designing domain-specific deep neural networks for scientific data analysis. To achieve these goals, I have designed a series of algorithms, focusing on molecular dynamics simulations and protein folding as archetypal scientific applications. Specific contributions include: (i) a communication-minimizing shift-collapse algorithm for n-tuple computation on central processing units (CPUs); (ii) tuple decomposition for graphics processing unit (GPU) acceleration of n-tuple computation; (iii) a graph neural network (GNN) to classify different crystalline phases; and (iv) a multiscale neural network to predict protein contact maps.

