AI Interpreting Solutions Evaluation Toolkit

In an August 27, 2025, press release, the Stakeholders Advocating for Fair and Ethical AI in Interpreting Task Force (SAFE AI), in partnership with the Coalition for Sign Language Equity in Technology (CoSET), announced the release of Part A of their three-part “AI Interpreting Solutions Evaluation Toolkit.”

“Part A: Organization, Implementation and Management” of the new toolkit is aimed at providing early adopters of AI interpreting with a “comprehensive, risk-informed approach for evaluating AI and hybrid AI-human language access solutions.”

The SAFE AI Task Force, founded in 2023 by 11 industry stakeholders, advocates for fair and ethical use of AI in interpreting. Their website lists academics, interpreters, interpreting trainers, and language access directors from various institutions and Language Solutions Integrators (LSIs) among the “assembly members” of the Task Force.

CoSET, formerly the Advisory Group on AI and Sign Language Interpreting (AG), was formed in 2023 to provide Deaf expertise to the SAFE AI Task Force. According to the press release, it focuses on “advanc[ing] sign-language equity in AI by setting standards and amplifying Deaf expertise as an independent partner to the [SAFE AI] Task Force.”

What’s in the Toolkit?

Part A of the new AI interpreting toolkit is divided into two components: a broad contextual overview and a detailed practical guide.

The overview component is presented as the “why” motivating the toolkit’s “risk-informed” perspective. It presents three “decision pillars” that organizations are advised to consider before adopting AI interpreting solutions: organizational readiness, technical fitness, and total cost of adoption.

These pillars then form the basis for the more detailed framework presented in the practical component of the toolkit.

A section covering “foundational considerations for AI implementation” establishes the context for the practical guide that follows. It urges organizations to consider both the opportunities and potential risks when adopting AI tools, highlighting issues such as biased training data, which may result in uneven translation quality, with tools likely performing worse in low-resource languages.

These risks should then prompt organizations to perform internal pilot testing, making sure that tools not only offer the needed language pairs, but can also perform adequately for an organization’s specific needs in “real-world conditions.”

Interestingly, the changing legal landscape for language access in the United States is also highlighted. Taking a long-term perspective, the document points out that underlying state and federal laws remain codified despite fluctuating enforcement and funding. 

The contextual overview also advises organizations to consider “current budget pressures and the potential costs of rebuilding services when priorities shift again” and to collaborate across internal departments for input on adoption and long-term management of tools.

Five Practical Checklists

The second major component of Part A of the toolkit offers a practical, in-depth guide in the form of five checklists designed to help organizations systematically evaluate AI interpreting solutions. 

The checklists address the full adoption process from internally evaluating an organization’s readiness for AI adoption to preparing appropriate requests for proposals (RFPs) that include requests for AI interpreting.

These instruments fall broadly into two formats. The first one employs a binary system where the user marks each listed criterion as “ready” or “not ready,” then tallies the totals to aid in decision making, while the second format relies more on qualitative examples and guidance.
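The tallying step in the binary format can be sketched in a few lines of Python. This is only an illustration: the function name `tally_readiness` and the sample criteria are hypothetical and do not come from the toolkit, which leaves the interpretation of the totals to each organization.

```python
from collections import Counter

ALLOWED_MARKS = {"ready", "not ready"}


def tally_readiness(marks):
    """Count "ready" and "not ready" marks for one checklist.

    `marks` maps each criterion (a free-text label) to one of the
    two allowed marks. The criteria themselves are placeholders here.
    """
    for criterion, mark in marks.items():
        if mark not in ALLOWED_MARKS:
            raise ValueError(f"Unexpected mark {mark!r} for {criterion!r}")
    counts = Counter(marks.values())
    return {"ready": counts["ready"], "not ready": counts["not ready"]}


# Hypothetical criteria, invented for illustration only.
example = {
    "Staff trained on escalation to human interpreters": "ready",
    "Pilot testing plan approved": "not ready",
    "Budget for ongoing monitoring": "ready",
}
totals = tally_readiness(example)
print(totals)
```

The totals are meant to support a decision, not make it. The sketch deliberately applies no pass/fail threshold, since the article does not describe one.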

For example, a “risk factor evaluation matrix” informs the evaluation in checklist three. It provides examples of risk factors organized by severity to help organizations classify the use of AI interpreting in a particular use case on a five-point scale from “no risk” to “high risk.” 

Notably, the toolkit does not specifically define a “low risk” versus “high risk” scenario, nor what an organization should identify as “low complexity” versus “high complexity” communications.

The overview section expressly addresses this choice, saying, “this Toolkit is designed to enable your organization to assess the risks of early adoption based on your institutional context.”

At the time of writing, parts B and C of the toolkit, covering “technical specifications” and “legalities and practical considerations,” have not yet been released.