Responsible for supporting data science projects and solutions by leveraging service engineering and MLOps experience to solve a variety of use cases across the Group and for its customers. Expected to be highly skilled in setting up and supporting an MLOps practice and framework, with the ability to design, build and scale MLOps components and services for new and existing use cases across the Group in a cloud environment.
• Engage with stakeholders to support the design and delivery of data science projects and solutions.
• Use service engineering and MLOps techniques to solve business problems.
• Responsible for working with a team of MLOps engineers to develop and maintain our cloud-based ML and AI development and production platforms.
• Design, build and maintain MLOps repositories, pipelines and components for cloud-based model processing and serving.
• Support MLOps framework processes for exploratory data analysis and solution development.
• Lead and develop a team of junior MLOps engineers.
• Contribute to our agile way of working and our innovation culture.
• Maintain up-to-date knowledge of ML platforms and related technologies.
• Translate business requirements into system requirements.
• Consistently document all implemented ML models, systems and processes.
• Support tools, ML models and infrastructure lifecycles via standard service management principles and processes.
• Execute on automation directives by identifying repeatable tasks, writing code to replace them, and adding a scheduler or other trigger so that each job runs automatically while being monitored.
• Enable ML and workload orchestration by configuring and controlling systems that can scale horizontally using specialized tools and techniques.
• Enable ML and component containerization by isolating individual services into containers, allowing them to run anywhere.
• Enable containers to run and scale horizontally using orchestration tools.
• Execute on ML engineering directives by using engineering techniques to automate and scale the ML model lifecycle and host models in a production environment.
REQUIRED CERTIFICATION/PROFESSIONAL REGISTRATION
• Data and cloud certifications will be advantageous (GCP, Azure, AWS), as well as certifications for other products in our stack (Kubernetes, Istio, FastAPI, Docker, PyTorch, YAML).
• 3-year degree/diploma (NQF level 6), preferably in Computer Science, Mathematics, Statistics, Machine Learning or a related field. A relevant postgraduate degree will be an added advantage.
• 3-5 years' relevant experience, of which at least 2 years must have been in a machine learning operations environment. Experience in ICT/Telecommunications will be an advantage.
• Experience with system and process analysis and design.
• Experience with Google Cloud Platform.
• Expected to stay abreast of new machine learning frameworks and developments and to put them into practice.
• Foundational knowledge of relational and non-relational databases
• Machine learning operations knowledge base
• ML concepts, terms and frameworks
• ML engineering knowledge (Python, Scala, Shell scripting, Kubernetes, FastAPI, Cloud Run, Docker, PyTorch, YAML)
• Cloud MLOps knowledge (Cloud Source Repositories, Automated Build and Release Management, Container Registry Management and Secure Endpoint Orchestration)
• MLOps CI/CD and cloud deployment best practices
• Cloud containerisation knowledge, standards and deployment options
• Cloud computing and platform management (GCP, Azure, AWS, etc.)
Must have a minimum of 3-4 years' experience in a management position within the high-end fashion or hospitality industry. Must have good attention to detail.