Autonomous Quality in AI Model Productization
Autonomous quality in AI model productization is the key to scalable machine learning.
Intel IT AI is a group of over 200 data scientists, machine learning (ML) engineers and AI product experts. We deliver high business impact and transform Intel’s internal processes with AI, including engineering, manufacturing, hardware validation, sales and performance. In the past decade, we have deployed 500 AI models, including more than 100 ML solutions in the last year. Our tolerance for model quality degradation is extremely low because our solutions have become integral parts of Intel’s business-critical activities.
Embedding and enabling AI across Intel’s vital business operations allowed us to deliver over USD 1.56B in value during 2021. Our team has learned that to scale AI effectively, we cannot view projects on an individual level. To truly achieve AI model quality, we must thoroughly understand the AI quality lifecycle and solve the challenges of maintaining the quality of AI models at scale, from experiments through productization and the maintenance phase (post-deployment to production).
Maintaining quality for a few models is easy, but at scale the challenge grows significantly because a modification to an AI model involves more than just a code change. Our primary concern is that maintaining quality along the lifecycle becomes so time-consuming that our teams will eventually be unable to take on new projects. Embedding industry-standard quality-control principles and tools into our AI pipeline allows quality to become ubiquitous and near-autonomous for the AI model across development, deployment and execution.
Once AI models are deployed to production, their quality and performance often degrade over time. Predictions become less precise, and the degradation may harm the business processes that rely on those predictions. It is therefore critical to track models in production, monitor their health, and respond to problems as they arise. We incorporated a subsystem in Microraptor that enables the development, deployment and management of all model quality-control metrics. We use it for alerting and actuation: it triggers an automatic retrain or, if necessary, requests human attention.
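The alerting and actuation flow described above can be illustrated with a minimal sketch. It is not Microraptor's API: the names `QualityPolicy`, `Action` and `decide_action`, the thresholds, and the choice of a rolling mean over a "higher is better" metric are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean
from typing import Sequence


class Action(Enum):
    """Hypothetical responses to a quality check."""
    NONE = "none"          # model is healthy, do nothing
    RETRAIN = "retrain"    # trigger an automatic retrain
    ESCALATE = "escalate"  # request human attention


@dataclass
class QualityPolicy:
    """Hypothetical per-model thresholds (assumed, not from the source)."""
    baseline: float       # metric value measured at deployment time
    retrain_drop: float   # relative drop that triggers an automatic retrain
    escalate_drop: float  # relative drop that requires human attention
    window: int = 5       # number of recent observations to average


def decide_action(history: Sequence[float], policy: QualityPolicy) -> Action:
    """Map recent metric values (higher is better) to an action."""
    if len(history) < policy.window:
        return Action.NONE  # not enough evidence to act yet

    # Average the most recent observations to smooth out single bad batches.
    recent = mean(history[-policy.window:])
    drop = (policy.baseline - recent) / policy.baseline

    # Check the more severe threshold first.
    if drop >= policy.escalate_drop:
        return Action.ESCALATE
    if drop >= policy.retrain_drop:
        return Action.RETRAIN
    return Action.NONE


policy = QualityPolicy(baseline=0.90, retrain_drop=0.05,
                       escalate_drop=0.15, window=3)
healthy = decide_action([0.90, 0.89, 0.91], policy)
degraded = decide_action([0.85, 0.84, 0.83], policy)
broken = decide_action([0.75, 0.74, 0.73], policy)
```

In this sketch a moderate drop is handled without human involvement, and only a severe drop is escalated, which mirrors the goal of keeping people out of the loop except when necessary.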
Our approach rests on the following IT AI principles for software quality:
- Enable quality to be easily embedded as part of the model/solution code.
- Support handling quality issues at scale, ideally without human intervention.
- Base decisions and projects on high-quality data.
- Support efficient, highly automated work within domains (product AI, data scientists and ML engineers) and across domains.
- Focus quality investment on delivering high value: customer satisfaction, reliability, efficiency and agile feedback loops.
We continue to refine our strategy and platform applications to realize our vision for autonomous quality in AI model productization. As we do so, we plan to keep working with Intel’s business units and design teams to implement our principles and to put automation, AI and data to work so that quality keeps pace with the growing scale of our business.