Publications

(2018). High performance stencil code generation with Lift. Proceedings of the 16th ACM/IEEE International Symposium on Code Generation and Optimization.

(2018). Bulk-synchronous parallel simultaneous BVH traversal for collision detection on GPUs. Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games.

(2018). Automatic Matching of Legacy Code to Heterogeneous APIs: An Idiomatic Approach. Proceedings of the 23rd International Conference on Architectural Support for Programming Languages and Operating Systems.

(2018). A Modular Approach to Performance, Portability and Productivity for 3D Wave Models. 7th International Workshop on Domain Specific Languages and High-level Frameworks for High Performance Computing.

(2017). Strategy Preserving Compilation for Parallel Functional Code. CoRR.

(2017). Performance Portability For Room Acoustics Simulations. Proceedings of the 20th International Conference on Digital Audio Effects.

(2017). ParTeCL: parallel testing using OpenCL. Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis.

(2017). Lift: A Functional Data-Parallel IR for High-Performance GPU Code Generation. Proceedings of the 15th ACM/IEEE International Symposium on Code Generation and Optimization.

(2017). Just-in-time GPU compilation for interpreted languages with partial evaluation. Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments.

(2017). Compiler-assisted test acceleration on GPUs for embedded software. Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis.

(2017). A Study of Dynamic Phase Adaptation Using a Dynamic Multicore Processor. ACM Transactions on Embedded Computing Systems (Special Issue CASES 2017), ACM TECS.

(2016). Selecting Heterogeneous Cores for Diversity. ACM Transactions on Architecture and Code Optimization, ACM TACO.

(2016). Performance Portable GPU Code Generation for Matrix Multiplication. Proceedings of the 2016 Workshop on General Purpose Processing on Graphics Processing Units.

(2016). Matrix Multiplication Beyond Auto-Tuning: Rewrite-based GPU Code Generation. Proceedings of the 2016 International Conference on Compilers, Architecture and Synthesis for Embedded Systems.

(2016). Four Metrics to Evaluate Heterogeneous Multicores. ACM Transactions on Architecture and Code Optimization, ACM TACO.

(2016). Compositional Compilation for Sparse, Irregular Data Parallelism. Proceedings of the 2016 Workshop on High-Level Programming for Heterogeneous and Hierarchical Parallel Systems.

(2016). A Machine Learning Approach to Mapping Streaming Workloads to Dynamic Multicore Processors. Proceedings of the 17th ACM SIGPLAN/SIGBED conference on Languages, Compilers and Tools for Embedded Systems.

(2015). Runtime Code Generation and Data Management for Heterogeneous Computing in Java. Proceedings of the 12th International Conference on Principles and Practice of Programming on the Java Platform: Virtual machines, languages, and tools.

(2015). Generating Performance Portable Code using Rewrite Rules: From High-Level Functional Expressions to High-Performance OpenCL Code. Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming.

(2015). Diversity: A Design Goal for Heterogeneous Processors. IEEE Computer Architecture Letters, IEEE CAL.

(2015). Carpet Unrolling Descriptors for Character Control On Uneven Terrain. Proceedings of the 8th ACM SIGGRAPH Motion in Games Conference.

(2014). Measuring flexibility in single-ISA heterogeneous processors. Proceedings of the 23rd International Conference on Parallel Architectures and Compilation.

(2014). Exploiting GPU hardware saturation for fast compiler optimization. Proceedings of Workshop on General Purpose Processing Using GPUs.

(2014). Community-driven reviewing and validation of publications. Proceedings of the 1st ACM SIGPLAN Workshop on Reproducible Research Methodologies and New Publication Models in Computer Engineering.

(2014). Automatic optimization of thread-coarsening for graphics processors. Proceedings of the 23rd International Conference on Parallel Architectures and Compilation.

(2014). A Composable Array Function Interface for Heterogeneous Computing in Java. Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming.

(2013). Dynamic microarchitectural adaptation using machine learning. ACM Transactions on Architecture and Code Optimization, ACM TACO.

(2013). A large-scale cross-architecture evaluation of thread-coarsening. Proceedings of the 2013 Conference on High Performance Computing Networking, Storage and Analysis.

(2012). Exploring and predicting the effects of microarchitectural parameters and compiler optimizations on performance and energy. ACM Transactions on Embedded Computing Systems, ACM TECS.

(2012). Compiling a High-Level Language for GPUs (via Language Support for Architectures and Compilers). Proceedings of the 33rd ACM SIGPLAN Symposium on Programming Language Design and Implementation.

(2011). An empirical architecture-centric approach to microarchitectural design space exploration. IEEE Transactions on Computers, IEEE TC.

(2010). A Predictive Model for Dynamic Microarchitectural Adaptivity Control. Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture.

(2009). Rapid Early-Stage Microarchitecture Design Using Predictive Models. Proceedings of the 2009 IEEE International Conference on Computer Design.

(2009). Portable compiler optimisation across embedded programs and microarchitectures using machine learning. Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture.

(2008). Exploring and predicting the architecture/optimising compiler co-design space. Proceedings of the 2008 International Conference on Compilers, Architecture and Synthesis for Embedded Systems.

(2007). Microarchitectural Design Space Exploration Using An Architecture-Centric Approach. Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture.

(2007). Fast compiler optimisation evaluation using code-feature based performance prediction. Proceedings of the 4th International Conference on Computing Frontiers.

(2006). Automatic performance model construction for the fast software exploration of new hardware designs. Proceedings of the 2006 International Conference on Compilers, Architecture and Synthesis for Embedded Systems.

(2005). Java Byte Code synthesis for reconfigurable computing platforms. Master’s thesis.

(2005). Enabling unrestricted automated synthesis of portable hardware accelerators for virtual machines. Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis.
