NIXSolutions: AI Models Face EU Compliance Challenges

A recent review of compliance with the European Union’s (EU) Artificial Intelligence Act (AI Act) revealed critical issues in AI models developed by companies like Meta, OpenAI, and others. These flaws include vulnerabilities to cyberattacks and biased inferences. Companies that fail to meet the new regulations could face hefty fines—up to €35 million or 7% of their global annual revenue.

The push for new AI regulations accelerated in 2022 after OpenAI launched ChatGPT, which triggered public concern about the risks of artificial intelligence. This led to the creation of the General Purpose AI (GPAI) rulebook, which aims to ensure AI technologies operate safely and ethically. The law will be phased in over the next two years, giving companies time to adjust.

NIXSolutions

LatticeFlow’s LLM Checker Highlights Key Gaps

The Large Language Model Checker (LLM Checker) was developed to help companies align with the AI Act’s requirements. This tool, created by LatticeFlow AI in partnership with ETH Zurich and the Bulgarian INSAIT Institute, evaluates AI models on technical reliability, security, and resilience against cyberattacks. Early feedback from EU officials has been positive.

Models are scored on a scale from 0 to 1. Data from LatticeFlow reveals that AI systems from Alibaba, Anthropic, Meta, and OpenAI scored 0.75 or higher on average. Despite these solid performances, significant weaknesses were found. For example, some AI models showed troubling biases in gender, race, and other categories. OpenAI’s GPT-3.5 Turbo earned a low 0.46 for discriminatory inference, and Alibaba’s Qwen1.5 72B Chat fared worse with a 0.37 score.

In terms of resilience to cyberattacks—particularly prompt hijacking—some models also underperformed. Meta’s Llama 2 13B Chat scored 0.42, while Mistral’s 8x7B Instruct scored 0.38. This highlights the need for further investments to strengthen AI models against such attacks, notes NIXSolutions.

Steps Toward Compliance and Next Moves

Anthropic’s Claude 3 Opus model, backed by Google, achieved the best overall performance with a score of 0.89, demonstrating strong compliance with the AI Act’s standards. To assist developers further, LatticeFlow has announced that the LLM Checker will be made freely available to ensure models meet legal requirements.

Petar Tsankov, LatticeFlow’s CEO, emphasized that although the test results showed room for improvement, they provide companies with a roadmap for compliance: “While the EU continues refining the rules, these early insights will help developers prepare for regulatory demands.”

The EU is still working out how the AI Act will apply to generative AI, with experts drafting a code of practice to guide the technology. We’ll keep you updated as these developments unfold.