OneOligo: Use oneAPI to Accelerate DNA Data Storage
Speaker: Raja Appuswamy, EURECOM
In the European Commission-funded Future and Emerging Technologies initiative OligoArchive, we are working on transforming DNA, the biological building block of life, into a digital building block for long-term data archival. One of the key steps in retrieving digital data stored in DNA is clustering billions of strings by edit distance. Because edit distance is computationally expensive, this step has become a critical bottleneck in the DNA data retrieval pipeline. In this talk, we present project OneOligo, our scalable, hardware-accelerated solution for DNA read clustering. We first provide an overview of the DNA data storage pipeline. Then, we present OneJoin, a string-similarity join algorithm that combines algorithmic advances in low-distortion embedding with the cross-architecture programming capability offered by DPC++ to scale up clustering across CPUs and GPUs.
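To make the bottleneck concrete, the following is a minimal sketch of a naive baseline in plain C++: a dynamic-programming edit distance and a greedy clustering loop built on it. It is not the OneJoin algorithm and uses no DPC++ or embedding; the function names `edit_distance` and `greedy_cluster`, and the idea of comparing each read against one representative per cluster, are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein edit distance using two rolling rows.
// Cost: O(|a| * |b|) time and O(|b|) space per pair of strings.
std::size_t edit_distance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;  // "" -> b[0..j)
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // a[0..i) -> ""
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            curr[j] = std::min({prev[j] + 1,      // delete a[i-1]
                                curr[j - 1] + 1,  // insert b[j-1]
                                subst});          // match or substitute
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Hypothetical naive baseline: each read joins the first cluster whose
// representative is within `threshold` edits, otherwise it starts a new
// cluster. Returns one cluster id per read, in input order.
std::vector<std::size_t> greedy_cluster(const std::vector<std::string>& reads,
                                        std::size_t threshold) {
    std::vector<std::size_t> reps;  // index of each cluster's representative
    std::vector<std::size_t> labels(reads.size());
    for (std::size_t i = 0; i < reads.size(); ++i) {
        bool placed = false;
        for (std::size_t c = 0; c < reps.size(); ++c) {
            if (edit_distance(reads[i], reads[reps[c]]) <= threshold) {
                labels[i] = c;
                placed = true;
                break;
            }
        }
        if (!placed) {
            labels[i] = reps.size();
            reps.push_back(i);
        }
    }
    assert(labels.size() == reads.size());
    return labels;
}
```

In the worst case this loop performs a number of edit distance computations that grows with the number of reads times the number of clusters, and each computation is quadratic in read length. That cost is what makes the approach impractical at billions of reads and motivates cheaper filtering and hardware acceleration.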
Additional Resources
Great Cross-Architecture Challenge—A Coding Challenge
Calling all C++, DPC++, and CUDA developers. We’re searching for the next oneAPI hero—someone who can write code that will run on the latest CPUs, GPUs, and FPGAs. Submit your best projects to win some amazing prizes.
Supercomputing 2020 (SC20) Recorded Sessions on oneAPI
- C++ for Heterogeneous Programming: oneAPI
- Performance Tuning with the Roofline Model on GPUs and CPUs
- Panel: The oneAPI Software Abstraction for Heterogeneous Computing
Self-paced Trainings Using Jupyter* Notebooks
Sign Up for Intel® DevCloud for oneAPI
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.