
ISO/IEC PWI TS 42110 Information technology — Artificial intelligence — AI inference framework for AI systems

Scope

This document provides guidance for a comprehensive “AI inference framework for AI systems”. The framework includes AI inference concepts and terminology, AI inference modelling and components, AI inference framework processes, and AI inference performance. The AI inference framework is designed for generalized AI systems and is not specific to any particular type of AI system.

Purpose

This document describes the “AI inference framework for AI systems”.

Learning and inference are the two key processes for building AI systems. The learning process is the phase in which AI models acquire knowledge by utilizing data. When AI models ‘learn’, several trial-and-error iterations occur until the results are satisfactory. The inference process, on the other hand, is the phase in which the trained models are used to make decisions in real-world applications, often in complicated ways. The components of AI inference include data sources and data destinations connected through the processes of AI systems. These two processes play crucial roles in enhancing the capabilities and performance of AI systems.

The learning phase involves designing AI models effectively so that they acquire extensive knowledge. Learning is the phase that shapes and improves the capabilities of a model, while inference is the phase in which the trained model is used to solve real-world problems and make predictions. It is equally crucial to focus on how to utilize the learned AI models to achieve efficient system behaviour during the inference phase.

Numerous intricate design and implementation challenges surface during the AI inference phase, encompassing aspects such as cost, computational efficiency, optimization, and deployment pipelines. The rise of API-based Software as a Service (SaaS) offerings, where AI inference is conducted at extremely high transaction volumes for consumer, commercial and enterprise applications, elevates the importance of AI inference considerations for stakeholders across the AI ecosystem.

Therefore, optimizing AI inference is crucial for improving efficiency, scalability, and cost-effectiveness. It increases the speed of machine learning models, enabling real-time responses and efficient resource use, particularly in large-scale AI systems. This optimization facilitates system scalability to accommodate user growth or increasing data volumes, leading to cost savings by utilizing resources more effectively and reducing infrastructure expenses.

Although optimization methods are also applied during the training phase, AI inference optimization differs from training-phase optimization. Optimization in AI inference is the process of refining the model and its operational environment to improve performance metrics such as speed, accuracy, and energy efficiency, without compromising the core functionality. AI inference optimization includes key techniques such as quantization, pruning, model distillation and hardware acceleration.
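
As an illustration of one of the techniques above, the following sketch applies symmetric int8 post-training quantization to a list of weights. The function names and the sample weights are hypothetical, chosen only to make the idea concrete; production implementations are provided by toolchains such as PyTorch or TensorFlow Lite.

```python
# Illustrative sketch: symmetric int8 quantization of a weight vector,
# one of the inference-optimization techniques named above.

def quantize_int8(weights):
    """Map float weights onto the int8 range [-127, 127] with one shared scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Approximately recover the original floats for use at inference time."""
    return [qi * scale for qi in q]

weights = [0.12, -0.5, 0.33, 1.0, -0.99]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# The quantized model stores 1 byte per weight instead of 4, at the cost
# of a small rounding error per weight (bounded by scale / 2).
```

The trade-off shown here is the general one behind quantization: smaller, faster models in exchange for a bounded loss of numerical precision.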

No matter how optimized the learning process is, if optimization does not occur during the inference phase, the AI system will not operate efficiently, ultimately resulting in ineffective AI services. More than 90% of AI services depend on the performance of AI inference. In particular, as AI systems become more hybrid and larger in scale, the importance of the inference process is increasing.

Despite the importance of both processes, many standards developed so far have focused predominantly on the AI learning process in AI systems. Therefore, this document aims to develop a standard for the optimal AI inference process, to support the balanced efficiency of AI systems.

The related standards in SC 42 for learning and inference processes in AI systems are shown in Figure 1 (see annex 1 - Figures).

ISO/IEC 22989:2022, ISO/IEC 23053:2022, and ISO/IEC DIS 5392 address the building of AI models, while this document provides guidance on how to use those AI models to produce optimal output when requirements or data are provided as input.

The terms of reference (ToRs) are as follows.

1) The concepts and terminology of AI inference framework

— Concepts and terminology of AI Inference for AI systems

— Overview of AI Inference framework for AI systems

— Terminology

2) The modelling and components of optimal AI inference framework

— Levels of AI Inference

— Simple and hybrid AI inference modelling

3) The process of AI inference framework

— AI inference model optimization, testing, evaluation, deployment, system behaviour process and monitoring

— AI inference performance

Justification

Modern AI systems typically contain elements of both symbolic AI and subsymbolic AI. Such systems are called hybrid AI systems. A hybrid AI system combines the learned representations and statistical methods of the subsymbolic AI phase (training the model) with the logical rules and knowledge representations of the symbolic AI phase (e.g. language modelling). This integration allows the system to leverage both the patterns learned from data and the contextual understanding provided by symbolic rules, for example to improve speech recognition accuracy. Recently, modern AI systems (hybrid AI systems) have evolved towards large-scale AI, and the functional role of symbolic AI is decreasing.

Indeed, with the recent emergence of generative AI, the field of artificial intelligence is undergoing a significant increase in complexity. There is a growing demand for hybrid AI systems that seamlessly integrate different approaches and technologies. The need to use various models flexibly across different platforms is becoming increasingly apparent. In this changing landscape, it is anticipated that artificial intelligence inference frameworks will provide important guidance and structure for the effective deployment and utilization of a diverse range of AI models in different contexts and applications.

There is no clear definition of “AI inference”, but the term “inference” is defined in ISO/IEC 22989:2022 as “reasoning by which conclusions are derived from known premises” (ISO/IEC 22989:2022, 3.1.17). In practical applications, AI encompasses various paradigms, and a framework for AI inference becomes essential due to the inherent complexity of AI systems.

The deployment phase, occurring after model training, is when the theoretical capabilities of a model transition into practical use. In this phase, AI inference assumes a critical role in real-world applications, influencing the effectiveness, responsiveness, and real-time decision-making abilities of AI systems. Inference optimization in AI applications like autonomous vehicles and natural language processing enables real-time effective decision-making and efficient resource use. By fine-tuning this process and maximizing resource efficiency, organizations can enhance AI performance, facilitating accurate, large-scale deployments and unlocking the full potential of AI technology.
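
The responsiveness and resource-efficiency gains discussed above are typically verified by measuring inference latency and throughput. The sketch below, using a hypothetical stand-in for a trained model, shows one minimal way such metrics can be collected; real deployments would measure against the production model and serving stack.

```python
# Illustrative sketch: measuring two common inference performance metrics,
# latency percentiles and throughput, against a stand-in model function.
import time

def dummy_model(x):
    # Hypothetical stand-in for real inference work (e.g. a forward pass).
    return sum(v * v for v in x)

def benchmark(model, batch, n_runs=200):
    """Run the model repeatedly and report p50/p95 latency and throughput."""
    latencies = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        model(batch)
        latencies.append(time.perf_counter() - t0)
    latencies.sort()
    p50 = latencies[len(latencies) // 2]
    p95 = latencies[int(len(latencies) * 0.95)]
    throughput = n_runs / sum(latencies)   # inferences per second
    return p50, p95, throughput

p50, p95, throughput = benchmark(dummy_model, list(range(1000)))
```

Tracking such metrics before and after an optimization step (e.g. quantization) makes the cost/benefit of the change explicit.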

The position of the AI inference framework to be developed in this proposal, as a functional view of an AI system, is shown in Figure 2 (see annex 1 - Figures).

The goal of this proposal is for the document to investigate and develop the AI inference framework for modern hybrid AI systems from an AI perspective, encompassing various AI paradigms. It aims to provide effective guidance on an AI inference framework for AI systems. This document specifically emphasizes the efficient generation of results through complex hybrid AI systems. The standard “AI inference framework” details how to optimize AI inference from pre-trained AI models, how to evaluate AI inference models, and related topics. The overall process of AI inference is provided in the outline document.

For example, assume an AI system involves many AI models and frameworks. When developing AI models from a subsymbolic AI perspective, some groups may use frameworks such as TensorFlow, PyTorch, and Keras, and different groups may use different frameworks to solve their specific problems. Different frameworks can thus be used for different AI models in modern AI systems. AI models may also need to run in diverse environments, including on client devices, at the edge, or in the cloud.

However, when running inference in production, these different AI models need to work together. This phenomenon becomes more pronounced in complicated hybrid AI systems, such as large language models or generative AI. At this point, an effective AI framework is crucial for achieving optimal performance in AI systems, including their AI models. For an effective AI framework, components such as parameters, throughput, model size, hardware dependency and the operational environment should be optimized during the inference process. Additionally, evaluation of the optimization should be conducted. This document provides guidance for AI inference, covering the process from input to output using learned AI models based on AI perspectives.
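
One possible way to let models built with different frameworks work together at inference time is an adapter layer exposing a single call signature. The sketch below uses hypothetical stand-in backends (not real TensorFlow or PyTorch APIs) purely to illustrate the pattern.

```python
# Illustrative sketch: a minimal adapter giving heterogeneous model backends
# one common predict() interface, so models from different frameworks can
# serve inference side by side. Both backend classes are hypothetical.

class TorchStyleBackend:
    def forward(self, x):              # PyTorch-style entry point
        return [v + 1 for v in x]

class TFStyleBackend:
    def __call__(self, x):             # TensorFlow/Keras-style entry point
        return [v * 2 for v in x]

class InferenceAdapter:
    """Wrap any backend behind a single predict() method."""
    def __init__(self, backend):
        self._backend = backend

    def predict(self, x):
        # Dispatch on whichever entry point the wrapped backend provides.
        if hasattr(self._backend, "forward"):
            return self._backend.forward(x)
        return self._backend(x)

models = [InferenceAdapter(TorchStyleBackend()), InferenceAdapter(TFStyleBackend())]
results = [m.predict([1, 2, 3]) for m in models]
# Every wrapped model now answers the same call signature.
```

Interchange formats such as ONNX pursue the same goal at the model-representation level rather than at the serving-code level.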

The outlined AI inference framework is designed to enhance the understanding and implementation of AI inference in diverse applications of AI systems. AI inference brings several advantages, such as real-time decision-making, automation, efficiency, and cost reduction, in many industrial fields.

Building on foundational terms and concepts in ISO/IEC 22989:2022, ISO/IEC 23053:2022, and ISO/IEC DIS 5392, the document will provide necessary insights into aspects of the AI inference process specific to AI systems.

Consequently, this document acts as a guideline for modelling AI inference, optimizing its performance, and managing its resource demands. By addressing challenges related to efficiency, scalability, and model performance, it aims to promote the responsible and effective use of AI inference.

The advantage of the standard “AI inference framework for AI systems” is the improvement of AI system performance, achieved by establishing foundational concepts and a cost-effective provisioning perspective.
