HPC application developers are hero programmers, because writing parallel programs is much harder than writing sequential ones. They have to understand the intricate details of the target architecture (e.g., GPUs) and of the programming models used to exploit its parallelism, and then express that parallelism in the code structure of the application. Guided by performance data, they need to go through thousands or millions of lines of code to devise a strategy for porting to the target architecture. Typically, developers first parallelize the application across nodes and then add in-node parallelism incrementally, exploiting the multicore CPUs or accelerators available on each node, until the target performance is met.
The data set for this challenge contains program metadata for the E3SM (Energy Exascale Earth System Model) application. It includes information about the usage of Fortran features, programming models, subroutine calls, the number of statements that are parallelized, statement types, Fortran module usage, and the source and object file locations of subroutines, among other things.
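To make the kinds of metadata listed above more concrete, here is a minimal Python sketch of how one record per subroutine might be represented and summarized before visualization. The record layout, every field name, and the sample values are assumptions made purely for illustration; they are not the actual schema or contents of the challenge data set.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SubroutineRecord:
    # Hypothetical layout: all field names are assumptions, not the real schema.
    name: str                     # subroutine name
    module: str                   # Fortran module that contains it
    source_file: str              # source file location
    calls: List[str] = field(default_factory=list)  # names of called subroutines
    statements: int = 0           # total number of statements
    parallel_statements: int = 0  # statements that are parallelized


def parallel_fraction_by_module(records: List[SubroutineRecord]) -> Dict[str, float]:
    """Return the fraction of parallelized statements for each module."""
    totals = defaultdict(lambda: [0, 0])  # module -> [parallel, total]
    for rec in records:
        totals[rec.module][0] += rec.parallel_statements
        totals[rec.module][1] += rec.statements
    return {m: (p / t if t else 0.0) for m, (p, t) in totals.items()}


def call_edges(records: List[SubroutineRecord]) -> List[Tuple[str, str]]:
    """Return (caller, callee) pairs, keeping only callees that have a record."""
    known = {rec.name for rec in records}
    return [(rec.name, callee) for rec in records
            for callee in rec.calls if callee in known]


# Made-up sample records, for illustration only.
records = [
    SubroutineRecord("driver", "mod_main", "main.F90", ["step", "external_lib"], 10, 0),
    SubroutineRecord("step", "mod_phys", "phys.F90", ["kernel"], 20, 5),
    SubroutineRecord("kernel", "mod_phys", "phys.F90", [], 20, 15),
]

fractions = parallel_fraction_by_module(records)
edges = call_edges(records)
```

Aggregating per module before drawing anything is one possible way to keep a view readable at the scale of a large code base: the call edges could feed a graph layout, and the per-module fractions could drive node color or size.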
Challenge Questions
We need a scalable way to visualize this information. The challenges are: