Kubernetes' agility, versatility, and resource scaling make it a platform of choice for data science, especially in shared environments. However, data scientists often need to work with many different libraries, languages, and applications, frequently in multiple versions. Conventional approaches, whether a legion of tailored images or a huge 20GB golden image, do not match the reality of production. In this session, we will demonstrate how you can leverage the concept of environment modules inside Kubernetes to solve the challenge of synchronously managing multiple containers of different types, making thousands of scientific libraries, languages, and packages dynamically available in a simple way. Inspired by work done and heavily used in the High Performance Computing (HPC) community, we will share a specific implementation that brings this production-proven architecture to Kubernetes and discuss how you can implement it in your own environment.
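To make the idea concrete, here is a minimal sketch of what HPC-style environment modules can look like in Kubernetes: a Pod mounts a shared, read-only module tree and loads the packages it needs at startup, instead of baking them into the image. All names, paths, and versions below are hypothetical illustrations, not the specific implementation presented in the session.

```yaml
# Hypothetical sketch: a small base image plus a shared module tree.
# The software lives on the mounted volume; "module load" (e.g. via
# Lmod or Environment Modules) adjusts PATH and related variables.
apiVersion: v1
kind: Pod
metadata:
  name: datasci-notebook
spec:
  containers:
  - name: notebook
    image: python:3.11-slim          # small base image, not a 20GB golden image
    command: ["/bin/bash", "-lc"]
    args:
      - |
        # Initialize the module system from the shared volume, then
        # load only the versions this workload needs.
        source /opt/modules/init/bash
        module load gcc/12.2 python/3.11 openblas/0.3.24
        exec start-notebook.sh
    volumeMounts:
    - name: module-tree
      mountPath: /opt/modules
      readOnly: true
  volumes:
  - name: module-tree
    persistentVolumeClaim:
      claimName: shared-modules      # e.g. backed by NFS or a read-many CSI volume
```

Because the module tree is shared across Pods, adding a new library version means updating one volume rather than rebuilding and redistributing every image.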