Programmers on parallel systems are increasingly turning to compiler-assisted parallel programming models such as OpenMP, OpenCL, Halide and TensorFlow. It is crucial to ensure that LLVM-based compilers can optimize parallel code, and sometimes the parallelism constructs themselves, as effectively as possible. At last year’s meeting, some of the organizers moderated a BoF that discussed the general issues for parallelism extensions in LLVM IR. Over the past year the organizers have investigated LLVM IR extensions for parallelism and prototyped an infrastructure that enables the effective optimization of parallel code in the context of C/C++/OpenCL. In this BoF, we will discuss several open issues regarding parallelism representations in LLVM with minimal LLVM IR extensions.
• What would be a minimal set of LLVM IR extensions?
• What properties are implied by the semantics of these extensions for regions? For example, are restrictions on alloca movement or memory barriers implied?
• How do we want to express these properties in LLVM IR?
• How would different parts of LLVM need to be updated to handle these extensions and where is the proper place in the pipeline to lower these constructs?
The organizers will explain how we have extended our front-end and middle-end passes to produce LLVM IR with a small set of extensions that represent parallel constructs. As an example, we can discuss how our prototype implementation supports OpenMP-like functionality in OpenCL* to improve the performance of autonomous-driving workloads.