TL;DR
Portugal’s AMÁLIA, a €5.5 million European Portuguese language model, is operational and outperforms comparable models on many benchmarks, but it raises three key questions about its openness, its native-language data, and its objectives. These questions highlight broader issues in Europe’s sovereign LLM efforts.
Portugal’s €5.5 million investment in the AMÁLIA project has produced a functioning European Portuguese language model that is now accessible to hundreds of thousands of users and outperforms comparable models on many benchmarks. However, experts are raising three critical questions about the model’s openness, native-language data, and primary goals, with significant implications for Portugal and for broader European AI sovereignty efforts.
The AMÁLIA project, involving approximately 60 researchers from Portugal’s top institutions, was announced in December 2024. Its base version was completed by September 30, 2025, and is publicly available through the FCT’s IAedu platform, which serves 450,000 academic users. The model continues pre-training from the EuroLLM multilingual foundation rather than being trained from scratch. It currently handles text-only input, with multimodal capabilities planned.
Technical benchmarks show that AMÁLIA outperforms previous fully open models on European Portuguese tasks and beats models like Qwen 3-8B on most benchmarks, although it still trails on some specific tests such as ALBA. The model’s knowledge cutoff is 2023, and final refinements are expected by June 2026. The approach builds on an existing multilingual foundation, in contrast to Italy’s Minerva, which was trained from scratch on Italian and English data.
Despite these technical achievements, public discourse, notably from researcher Duarte O.Carmo, highlights three unresolved questions: How open is ‘fully open’ really? How much native-language data is enough? And what should the model be optimized for? These questions are not accusations but serve as a framework for evaluating national AI investments and their strategic implications.
AMÁLIA: The three hard questions.
Portugal spent €5.5M to build a European Portuguese LLM. The base version is operational, and it beats Qwen 3-8B on most pt-PT benchmarks. So why are the most important questions still unanswered?
Last month, Duarte O.Carmo published the sharpest public analysis of AMÁLIA — Portugal’s state-funded European Portuguese large language model. He prefaces his critique with the necessary diplomatic apparatus before doing what almost nobody else in the European-sovereign-LLM discourse has been willing to do publicly: asking hard questions about whether the work, as released, actually does what it set out to do. This piece is a structural extension of his analysis. The AMÁLIA case study exposes three hard questions every national LLM effort needs to answer publicly — and the broader European sovereign-LLM movement has been operating without explicit answers to any of them.
Three questions every national LLM effort needs to answer publicly.
Duarte O.Carmo’s framing maps cleanly onto the structural argument, and each question lands specifically in AMÁLIA.
The three questions are structurally linked. Q3 (the optimization target) determines Q2 (how much native-language data is needed), which in turn conditions Q1 (whether the openness is sufficient for community contribution). The European sovereign-LLM movement collectively benefits when these questions become standard methodology disclosure rather than exceptional critique.

107 billion tokens. 5.8 billion clearly pt-PT.
This is the most tractable of the three questions, and its answer is surprising. For a model whose entire stated purpose is to prioritize European Portuguese, the native-language share of extended pre-training is roughly 5.5% (5.8 billion of 107 billion tokens). The implications cascade into every other question.
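The share quoted above can be checked with a few lines of arithmetic. The sketch below uses only the two headline figures from this section (107 billion tokens of extended pre-training, 5.8 billion clearly pt-PT). The constant and function names are illustrative and are not taken from the AMÁLIA release.

```python
# Headline figures quoted in this section (rounded, in tokens).
TOTAL_TOKENS = 107e9   # extended pre-training corpus
PT_PT_TOKENS = 5.8e9   # tokens identified as clearly European Portuguese


def native_share(native_tokens: float, total_tokens: float) -> float:
    """Return the native-language fraction of a training corpus."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return native_tokens / total_tokens


share = native_share(PT_PT_TOKENS, TOTAL_TOKENS)
other_per_native = (TOTAL_TOKENS - PT_PT_TOKENS) / PT_PT_TOKENS

print(f"pt-PT share: {share:.1%}")
print(f"other tokens per pt-PT token: {other_per_native:.1f}")
```

With these rounded inputs the share comes out at about 5.4%, in line with the roughly 5.5% quoted. The small gap presumably reflects rounding in the published token counts, which is an assumption rather than something the source states.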

The Olmo standard. AMÁLIA’s current state.
The Allen Institute for AI’s Olmo project defines what “fully open” requires in operational terms. Olmo doesn’t lead frontier benchmarks. That’s not the point. The point is to be the structural reference for openness. AMÁLIA’s “fully open source” claim should be measured against that operational standard.
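One way to make “fully open” checkable is to treat it as a list of released artifacts. The sketch below assumes the artifact categories that Olmo releases are generally described as covering: weights, training data, training code, evaluation code, intermediate checkpoints, and training logs. The category names and both helper functions are illustrative, and the example release is hypothetical. It is not a statement about what AMÁLIA has or has not published.

```python
# Artifact categories assumed for an Olmo-style "fully open" release.
# This list is an illustrative summary, not an official specification.
REQUIRED_ARTIFACTS = (
    "model_weights",
    "training_data",
    "training_code",
    "evaluation_code",
    "intermediate_checkpoints",
    "training_logs",
)


def missing_artifacts(released: set[str]) -> list[str]:
    """List the required artifact categories absent from a release."""
    return [name for name in REQUIRED_ARTIFACTS if name not in released]


def is_fully_open(released: set[str]) -> bool:
    """A release counts as fully open only if nothing is missing."""
    return not missing_artifacts(released)


# Hypothetical release used only to exercise the helpers.
example_release = {"model_weights", "evaluation_code"}
print(missing_artifacts(example_release))
print(is_fully_open(example_release))
```

Framing the claim as a set difference keeps the discussion concrete: a project can publish which categories it covers, and readers can see exactly what remains.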

Four strategic positions. AMÁLIA between two and three.
More than €100M in publicly disclosed funding has gone to the major European sovereign-LLM initiatives. The structural question every project faces is this: what competitive position are you actually staking out? There are four options, none mutually exclusive, but each requires different commitments.

Three standards. For AMÁLIA and the movement.
The structural critique generalizes beyond AMÁLIA. Italy, France, Germany, Switzerland, the OpenEuroLLM consortium, and every subsequent national project benefit from public discourse holding national LLM efforts to operational standards on openness, data accounting, and strategic positioning.
The European sovereign-AI agenda is a serious strategic project that deserves serious public discourse. O.Carmo’s analysis is what serious public discourse looks like. Appropriately diplomatic. Structurally rigorous. Willing to ask the hard questions in public when the public investment justifies it. More of this is needed — across every European sovereign-LLM project, not just AMÁLIA.
Why the Three Questions Matter for European AI Sovereignty
The three questions raised about AMÁLIA are emblematic of broader challenges facing European efforts to develop sovereign AI models. They highlight tensions between openness, native-language data sufficiency, and strategic goals. Addressing these questions is crucial for ensuring that investments like Portugal’s €5.5 million lead to truly autonomous, transparent, and effective language models that serve national interests and reduce dependency on non-European AI providers.
Failing to clarify these issues could result in models that are less open than claimed, that underperform because of insufficient native data, or that are misaligned with national priorities. Conversely, transparent answers could strengthen Europe’s position in AI development, foster innovation, and promote responsible AI practices aligned with regional values.
European Sovereign LLMs and the Structural Questions
The development of European-language large language models (LLMs) is a strategic priority in several countries, including Italy, Germany, France, Switzerland, Norway, and Sweden. Most projects, including Portugal’s AMÁLIA, face the same structural questions: the level of openness, the sufficiency of native-language data, and the primary objectives. These issues have largely been discussed in technical circles but remain underexplored in public discourse.
Many models are based on multilingual foundations, with varying strategies for native-language training—some train from scratch, others build on existing multilingual models. The European approach emphasizes sovereignty, transparency, and regional data use, but the lack of clear standards or benchmarks for openness and native data sufficiency complicates evaluation and comparison. The case of AMÁLIA exemplifies these broader dynamics, as it is both a technical achievement and a test case for European AI policy debates.
While the initial technical results are promising, the broader questions about strategic priorities and transparency remain open, and the next 12-24 months will be critical for clarifying these issues across the continent.
“The AMÁLIA project exemplifies the structural questions every European sovereign model must answer: how open is truly open, how much native data is enough, and what are we optimizing for?”
— Duarte O.Carmo
Unanswered Questions About AMÁLIA’s Openness, Data, and Goals
It is not yet clear how the Portuguese government and researchers will address the three core questions publicly. The final version of AMÁLIA is expected in June 2026, and ongoing discussions suggest that some issues—particularly regarding model openness and native data sufficiency—are still under deliberation. The extent of future multimodal capabilities also remains uncertain, as plans are still developing.
Moreover, it is unclear how these questions will influence broader European policy and funding strategies, or whether other national projects will adopt similar approaches or diverge significantly.
Next Milestones for AMÁLIA and European Sovereign LLMs
The upcoming months will see further refinements of AMÁLIA, with the final version expected in June 2026. Researchers and policymakers will likely address the three core questions more explicitly, potentially setting standards for openness, native data usage, and strategic objectives across Europe. Public evaluations and independent audits may also emerge, providing greater transparency and accountability.
Additionally, other European projects are expected to publish their progress, which will allow for comparative analysis and strategic adjustments. The broader discourse around sovereignty, openness, and native-language AI will intensify as these models mature.
Key Questions
What are the main technical achievements of AMÁLIA so far?
AMÁLIA outperforms previous fully open models on European Portuguese benchmarks and beats models such as Qwen 3-8B on most of them, though it still trails on some tests such as ALBA. Its knowledge cutoff is 2023.
Why are the questions about openness, native data, and goals important?
They are crucial for ensuring that the model is transparent, effective, and aligned with national and regional interests, reducing dependency on non-European AI providers and fostering trust and strategic autonomy.
When will the final version of AMÁLIA be available?
The final version is expected in June 2026, after further refinements and evaluations.
Are these issues unique to Portugal?
No, similar questions are being raised across Europe regarding other sovereign LLM projects, reflecting a common strategic challenge in the continent’s AI development.
What impact could these questions have on Europe’s AI future?
Addressing these questions could lead to more transparent, effective, and autonomous models, strengthening Europe’s position in AI and setting standards for responsible development.
Source: ThorstenMeyerAI.com