1. DOE Exascale Initiative Roadmap (2009) Architecture and technology workshop. San Diego

2. DOE Office of Science Summary report of the Advanced Scientific Computing Advisory Committee (ASCAC) Subcommittee. The opportunities and challenges of exascale computing, Fall 2010

3. Shalf J, Dosanjh S, Morrison J (2011) Exascale computing technology challenges. VECPAR 2010. LNCS 6449:1–25

4. Tian R, Sun N (2013) Some considerations about exascale computing in China. Commun China Comput Fed 9(2):52–60 (in Chinese)

5. Tian R (2013) Co-design thinking towards exascale computing. Inf Technol Lett 10(3):50–63 (in Chinese)

7. Thibodeau P (2012) Exascale unlikely before 2020 due to budget woes. Computerworld. Nov 19, 2012

8. Harrod W (2012) DOE Exascale Computing Initiative (ECI) update. DOE, Office of Science (SC), Oct 4, 2012

9. Dongarra J (2013) Emerging heterogeneous technologies for high performance computing. 22nd International Heterogeneity in Computing Workshop. IPDPS, Boston

10. DOE E3 Report, http://www.er.doe.gov/ascr/ProgramDocuments/ProgDocs.html

11. A platform strategy for the Advanced Simulation and Computing Program (NA-ASC-113R-07-Vol. 1-Rev. 0)

13. Chen J, Bell J (2011) Combustion exascale co-design center. Sixth international exascale software project workshop, San Francisco, April 6–7

14. Dennis JM, Edwards J, Guba O, St-Cyr A, Taylor MA, Worley PH (2012) CAM-SE: a scalable spectral element dynamical core for the community atmosphere model. Int J High Perform Comput Appl 26(1):74–89

17. Eisenbach M, Zhou CG, Nicholson DM, Brown G, Larkin J, Schulthess TC (2010) Thermodynamics of magnetic systems from first principles: WL-LSMS. In Proceedings of the 52nd Cray User Group meeting, CUG 2010

19. Tian R (2013) Meshfree/GFEM in hardware-efficiency prospective. Interaction and multiscale mechanics. DOI:10.12989/imm.2013.6.2.000

20. Tian R (2013) Extra-dof-free and linearly independent enrichments in GFEM. Comput Method Appl Mech Eng 266:1–22

21. Babuška I, Melenk JM (1997) The partition of unity method. Int J Numer Method Eng 40:727–758

22. Melenk JM, Babuška I (1996) The partition of unity finite element method: basic theory and applications. Comput Method Appl Mech Eng 139:289–314

23. Babuška I, Caloz G, Osborn JE (1994) Special finite element methods for a class of second order elliptic problems with rough coefficients. SIAM J Numer Anal 31:945–981

24. Duarte CA, Oden JT (1996) An h-p adaptive method using clouds. Comput Methods Appl Mech Eng 139(1–4):237–262

25. Oden JT, Duarte CA, Zienkiewicz OC (1998) A new cloud-based hp finite element method. Comput Method Appl Mech Eng 153(1–2):117–126

26. Strouboulis T, Babuška I, Copps K (2000) The design and analysis of the generalized finite element method. Comput Method Appl Mech Eng 181(1–3):43–69

27. Strouboulis T, Copps K, Babuška I (2000) The generalized finite element method: an example of its implementation and illustration of its performance. Int J Numer Method Eng 47:1401–1417

28. Strouboulis T, Copps K, Babuška I (2001) The generalized finite element method. Comput Method Appl Mech Eng 190(32–33):4081–4193

29. Strouboulis T, Zhang L, Babuška I (2003) Generalized finite element method using mesh-based handbooks: application to problems in domains with many voids. Comput Method Appl Mech Eng 192:3109–3161

30. Strouboulis T, Zhang L, Babuška I (2004) \(p\)-version of the generalized FEM using mesh-based handbooks with applications to multiscale problems. Int J Numer Method Eng 60:1639–1672

31. Strouboulis T, Zhang L, Wang D, Babuška I (2006) A posteriori error estimation for generalized finite element methods. Comput Method Appl Mech Eng 195:852–879

32. Strouboulis T, Babuška I, Hidajat R (2006) The generalized finite element method for Helmholtz equation: theory, computation, and open problems. Comput Method Appl Mech Eng 195:4711–4731

33. Strouboulis T, Hidajat R, Babuška I (2008) The generalized finite element method for Helmholtz equation, part II: effect of choice of handbook functions, error due to absorbing boundary conditions and its assessment. Comput Method Appl Mech Eng 197:364–380

34. Duarte CA, Babuška I, Oden JT (2000) Generalized finite element methods for three-dimensional structural mechanics problems. Comput Struct 77:215–232

35. Duarte CA, Hamzeh ON, Liszka TJ, Tworzydlo WW (2001) A generalized finite element method for the simulation of three-dimensional dynamic crack propagation. Comput Method Appl Mech Eng 190:2227–2262

36. Simone A, Duarte CA, Van der Giessen E (2006) A generalized finite element method for polycrystals with discontinuous grain boundaries. Int J Numer Method Eng 67:1122–1145

37. Duarte CA, Kim DJ (2008) Analysis and applications of a generalized finite element method with global-local enrichment functions. Comput Method Appl Mech Eng 197(6–8):487–504

38. O’Hara P, Duarte CA, Eason T (2009) Generalized finite element analysis for three dimensional problems exhibiting sharp thermal gradients. Comput Method Appl Mech Eng 198:1857–1871

39. Lancaster P, Salkauskas K (1981) Surfaces generated by moving least squares methods. Math Comput 37:141–158

41. Belytschko T, Lu YY, Gu L (1994) Element-free Galerkin methods. Int J Numer Method Eng 37:229–256

42. Li S, Liu WK (2002) Meshfree and particle methods and their applications. Appl Mech Rev 55:1–34

43. Cecka C, Lew A, Darve E (2011) Assembly of finite element methods on graphics processors. Int J Numer Method Eng 85(5):640–669

44. Karatarakis A, Metsis P, Papadrakakis M (2013) GPU-acceleration of stiffness matrix calculation and efficient initialization of EFG meshless methods. Comput Method Appl Mech Eng (in press), Accepted Manuscript, Available online 4 March 2013

45. Buttari A, Dongarra J, Kurzak J, Luszczek P, Tomov S (2008) Using mixed precision for sparse matrix computations to enhance the performance while achieving 64-bit accuracy. ACM Trans Math Softw (TOMS) 34(4):1–22

46. Göddeke D, Strzodka R, Turek S (2005) Accelerating double precision FEM simulations with GPUs. In Proceedings of ASIM 2005, 18th symposium on simulation technique

47. Strzodka R, Göddeke D (2006) Pipelined mixed precision algorithms on FPGAs for fast and accurate PDE solvers from low precision components. In IEEE symposium on field-programmable custom computing machines (FCCM 2006), pp 259–268

48. Strzodka R, Göddeke D (2006) Mixed precision methods for convergent iterative schemes. In Proceedings of the 2006 workshop on edge computing using new commodity architectures, pp D-59-60, May 2006

49. Göddeke D, Strzodka R, Turek S (2007) Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations. Int J Parallel, Emergent Distrib Syst (IJPEDS), Special issue: Appl. Parallel Comput 22(4):221–256

50. Göddeke D, Strzodka R (2008) Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations (part 2: Double precision GPUs). Technical University Dortmund, Technical report

51. Kurzak J, Dongarra J (2007) Implementation of mixed precision in solving systems of linear equations on the Cell processor. Concurr Comput Pract Experience 19(10):1371–1385

52. Wilkinson JH (1963) Rounding errors in algebraic processes. Prentice-Hall, Englewood Cliffs

53. Moler CB (1967) Iterative refinement in floating point. J ACM 14(2):316–321

54. Jankowski M, Woźniakowski H (1977) Iterative refinement implies numerical stability. BIT Numer Math 17(3):303–311

55. Higham NJ (2002) Accuracy and stability of numerical algorithms. Society for Industrial and Applied Mathematics, Philadelphia

56. Buttari A, Dongarra J, Langou J, Langou J, Luszczek P, Kurzak J (2007) Mixed precision iterative refinement techniques for the solution of dense linear systems. Int J High Perform Comput Appl 21:457–466

57. Langou J, Langou J, Luszczek P, Kurzak J, Buttari A, Dongarra J (2006) Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems). Proceedings of the 2006 ACM/IEEE conference on supercomputing

58. Demmel JW (1997) Applied numerical linear algebra. SIAM Press, Philadelphia

59. Demmel J, Hida Y, Kahan W, Li XS, Mukherjee S, Riedy EJ (2005) Error bounds from extra precise iterative refinement. Technical Report No. UCB/CSD-04-1344, LAPACK Working Note 165, Feb 2005

60. Taiji M, Narumi T, Ohno Y, Futatsugi N, Suenaga A, Takada N, Konagaya A (2003) Protein explorer: a petaflops special-purpose computer system for molecular dynamics simulations. Proceedings of Supercomputing 2003 in CD-ROM

61. Anderson E, Bai Z, Bischof C, Blackford LS, Demmel JW, Dongarra JJ, Du Croz J, Greenbaum A, Hammarling S, McKenney A, Sorensen D LAPACK Users’ Guide. SIAM, http://www.netlib.org/lapack/

62. Li XS, Demmel JW, Bailey DH, Henry G, Hida Y, Iskandar J, Kahan W, Kang SY, Kapur A, Martin MC, Thompson BJ, Tung T, Yoo DJ (2002) Design, implementation and testing of extended and mixed precision BLAS. ACM Trans Math Softw (TOMS) 28(2):152–205

63. Göddeke D, Strzodka R, Turek S (2007) Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations. Int J Parallel Emergent Distrib Syst (Special Issue: Applied Parallel Computing) 22(4):221–256

64. Göddeke D, Wobker H, Strzodka R, Mohd-Yusof J, McCormick P, Turek S (2009) Co-processor acceleration of an unmodified parallel solid mechanics code with FEASTGPU. Int J Comput Sci Eng 4(4):254–269

65. Strzodka R, Göddeke D (2006) Pipelined mixed precision algorithms on FPGAs for fast and accurate PDE solvers from low precision components. In FCCM’06: Proceedings of the 14th annual IEEE symposium on field-programmable custom computing machines (FCCM’06) pp 259–270

66. Strzodka R, Göddeke D (2006) Mixed precision methods for convergent iterative schemes. In Proceedings of the 2006 workshop on edge computing using new commodity architectures, D-59-60

67. Kurzak J, Dongarra JJ (2007) Implementation of mixed precision in solving systems of linear equations on the CELL processor. Concurr Comput Pract Experience 19(10):1371–1385

68. Liu J, Wang C, Ren J, Tian R (2012) A mixed precision explicit finite element algorithm on heterogeneous architecture and its CUDA implementation. Comput Sci 39(6):293–296 (in Chinese)

69. Liu J (2011) A mixed precision GPU acceleration algorithm and its application to FEM. MS thesis, Graduate School of Chinese Academy of Sciences (in Chinese)

70. Aifantis EC (1992) On the role of gradients in the localization of deformation and fracture. Int J Eng Sci 30(10):1279–1299

71. Hill R (1963) Elastic properties of reinforced solids: some theoretical principles. J Mech Phys Solids 11(5):357–372

72. Hill R (1972) On constitutive macro-variables for heterogeneous solids at finite strain. Proc R Soc Lond Ser A Math Phys Sci 326(1565):131–147

73. Tian R, Yagawa G (2005) Generalized node and high-performance elements. Int J Numer Method Eng 64:2039–2071

74. Tian R, Yagawa G, Terasaka H (2006) Linear dependence problems of partition of unity based generalized FEMs. Comput Method Appl Mech Eng 195:4768–4782

75. Tian R (2006) A PU-based 4-node quadratic tetrahedron and linear dependence elimination in three dimensions. Int J Comput Method 3:545–562

76. Tian R, Matsubara H, Yagawa G (2006) Advanced 4-node tetrahedrons. Int J Numer Methods Eng 68:1209–1231

77. Tian R, Yagawa G (2006) Allman’s triangle, rotational dof and partition of unity. Int J Numer Method Eng 69:837–858

78. McVeigh C, Liu WK (2010) Multiresolution continuum modeling of micro-void assisted dynamic adiabatic shear band propagation. J Mech Phys Solid 58(2):187–205

79. McVeigh C, Vernerey F, Liu WK, Brinson C (2006) Multiresolution analysis for material design. Comput Method Appl Mech Eng 195:5053–5076

80. McVeigh C, Vernerey FJ, Liu WK, Moran B, Olson GB (2007) An interactive microvoid shear localization mechanism in high strength steels. J Mech Phys Solids 55(2):224–225

81. McVeigh C (2007) Ph.D. Thesis, Northwestern University

82. McVeigh C, Liu WK (2008) Linking microstructure and properties through a predictive multiresolution continuum. Comput Method Appl Mech Eng 197:3268–3290

83. McVeigh C, Liu WK (2009) Multiresolution modeling of ductile reinforced brittle composites. J Mech Phys Solids 57:244–267

84. Tian R, Moran B, Liu WK, Olson GB (2008) Multiscale fracture simulator. Dynamic microstructure design consortium (ONR Contract: N00014-05-C-0241) base final report

85. Tian R, Liu WK, Chan S, Olson GB, Tang S, Wang JS, Jou HJ, Gong JD, Moran B (2009) Linking microstructures to fracture toughness-predictive 3D process zone simulations. The D 3-D annual PI Review, Evanston, March 23–25

86. Tian R, Chan S, Tang S, Kopacz AM, Wang JS, Jou HJ, Siad L, Lindgren LE, Olson GB, Liu WK (2010) A multi-resolution continuum simulation of the ductile fracture process. J Mech Phys Solids 58(10):1681–1700

87. Dongarra J et al The international exascale software project roadmap. www.iesp.org

88. Schroeder B, Gibson GA (2006) A large-scale study of failures in high-performance computing systems. Proceedings of the international conference on dependable systems and networks pp 249–258

89. Liu Y (2007) Reliability-aware optimal checkpoint/restart model in high performance computing, PhD Thesis. Louisiana

90. Cappello F, Geist A, Gropp B et al (2009) Toward exascale resilience. Int J High Perform Comput Appl 23:374–388

91. Geist A (2009) Co-design challenges going from petascale to exascale. Workshop on bio-molecular simulations on future computing architectures, Oak Ridge

92. Li L, Wang C, Ma Z, Tian R (2013) petaPar: a highly scalable and fault tolerant meshfree/particle simulation code based on free assembly mesh. HPC China 2013, Guilin, China, October 29–31, 2013

93. Gingold RA, Monaghan JJ (1977) Smoothed particle hydrodynamics: theory and application to non-spherical stars. Mon Not R Astron Soc 181:375–389

94. Libersky LD, Petschek AG (1990) Smooth particle hydrodynamics with strength of materials. Adv Free Lagrange Method Lect Notes Phys 395:248–257

95. Liu MB, Liu GR (2010) Smoothed particle hydrodynamics (SPH): an overview and recent developments. Arch Comput Method Eng 17:25–76

96. Warren MS, Salmon JK (1995) A portable parallel particle program. Comput Phys Commun 87(1):266–290

97. Goozee RJ, Jacobs PA (2003) Distributed and shared memory parallelism with a smoothed particle hydrodynamics code. Anziam J 44:202–228

98. Maruzewski P, Le Touzé D, Oger G et al (2010) SPH high-performance computing simulations of rigid solids impacting the free-surface of water. J Hydraul Res 48(S1):126–134

99. Springel V (2005) The cosmological simulation code GADGET-2. Mon Not R Astron Soc 364(4):1105–1134

100. Holmes DW, Williams JR, Tilke P (2011) A framework for parallel computational physics algorithms on multi-core: SPH in parallel. Adv Eng Softw 42(11):999–1008

101. Ihmsen M, Akinci N, Becker M et al (2011) A parallel SPH implementation on multi-core CPUs. Comput Graph Forum 30(1):99–112

102. Harada T, Koshizuka S, Kawaguchi Y (2007) Smoothed particle hydrodynamics on GPUs. Proc Comput Graph Int pp 63–70

103. Hérault A, Bilotta G, Dalrymple RA (2010) SPH on GPU with CUDA. J Hydraul Res 48(S1):74–79

104. Valdez-Balderas D, Domínguez JM, Rogers BD et al (2012) Towards accelerating smoothed particle hydrodynamics simulations for free surface flows on multi-GPU clusters. J Parallel Distrib Comput

105. Domínguez JM, Crespo AJC, Valdez-Balderas D et al (2013) New multi-GPU implementation for smoothed particle hydrodynamics on heterogeneous clusters. Comput Phys Commun 184:1848–1860

106. Domínguez JM, Crespo AJC, Gómez-Gesteira M (2013) Optimization strategies for CPU and GPU implementations of a smoothed particle hydrodynamics method. Comput Phys Commun 184:617–627

107. Sulsky D, Chen Z, Schreyer HL (1994) A particle method for history-dependent materials. Comput Method Appl Mech Eng 118:179–196

108. Love E, Sulsky DL (2006) An unconditionally stable, energy-momentum consistent implementation of the material-point method. Comput Method Appl Mech Eng 195(33–36):3903–3925

109. Wallstedt PC, Guilkey JE (2008) An evaluation of explicit time integration schemes for use with the generalized interpolation material point method. J Comput Phys 227(22):9628–9642

110. Zhang DZ, Ma X, Giguere PT (2011) Material point method enhanced by modified gradient of shape function. J Comput Phys 230(16):6379–6398

111. Więckowski Z (2004) The material point method in large strain engineering problems. Comput Method Appl Mech Eng 193(39–41):4417–4438

112. Sulsky D, Kaul A (2004) Implicit dynamics in the material-point method. Comput Method Appl Mech Eng 193(12–14):1137–1170

113. Wang HK, Liu Y, Zhang X (2012) The carbon nanotube composite simulation by material point method. Comput Mater Sci 57:23–29

114. Zhang X, Sze KY, Ma S (2006) An explicit material point finite element method for hyper velocity impact. Int J Numer Method Eng 66:689–706

115. Lian YP, Zhang X, Liu Y (2012) An adaptive finite element material point method and its application in extreme deformation problems. Comput Method Appl Mech Eng 241–244:275–285

116. Lian YP, Zhang X, Liu Y (2011) Coupling of finite element method with material point method by local multi-mesh contact method. Comput Method Appl Mech Eng 200(47–48):3482–3494

117. Więckowski Z (2004) The material point method in large strain engineering problems. Comput Method Appl Mech Eng 193(39–41):4417–4438

118. Sulsky D, Kaul A (2004) Implicit dynamics in the material-point method. Comput Method Appl Mech Eng 193(12–14):1137–1170

121. Joubert W (2012) Porting the denovo radiation transport code to Titan: lessons learned. OLCF Titan Workshop 2012

122. Cappello F (2009) Fault tolerance in petascale/exascale systems: current knowledge, challenges and research opportunities. Int J High Perform Comput Appl 23:212–226

123. Keyes D (2012) Large-scale simulation in science and engineering: digesting the fruit, replanting the fields. Co-Design 2012, Beijing, China, October 23–25, 2012