The GPT-OSS-120B model runs efficiently on a single 80 GB GPU, while the smaller GPT-OSS-20B model can run on edge devices with just 16 GB of memory, “making it ideal for on-device use cases, local inference, or rapid iteration without costly infrastructure,” according to OpenAI.
The launch came just two days before the company unveiled GPT-5. While GPT-5 grabbed mainstream headlines, GPT-OSS delivers something entirely different: full model access, unrestricted fine-tuning, and the ability to run locally, offline, and entirely outside OpenAI’s infrastructure.
Flexible, Easy to Run, Fully Customizable
In practice, this means complete control over deployment, the freedom to fine-tune with proprietary datasets, full data privacy with all client content kept in-house, and no recurring per-token fees once the model is running.
OpenAI said the new models are “flexible and easy to run anywhere — locally, on-device, or through third-party inference providers.” Coupled with its hosted offerings, this gives developers the ability to “choose the performance, cost, and latency they need to power AI workflows.”
“For developers who want fully customizable models they can fine-tune and deploy in their own environments, GPT-OSS is a great fit,” the company said. “For those seeking multimodal support, built-in tools, and seamless integration with our platform, models available through our API platform remain the best option,” it added.
Mixed Early Reviews on Translation
According to OpenAI’s model card, GPT-OSS-120B at the high reasoning setting delivers competitive results on multilingual understanding across 14 languages, coming close to the company’s proprietary o3-mini.
Early user feedback has been mixed. Some reported that GPT-OSS “excels in translation compared to the other smaller models,” with results “[…] matching premium closed models” in translation and multilingual information extraction tasks.
Others criticized its speed and performance in some languages, citing “questionable” Japanese output and German text with grammar and spelling errors, claiming it “can’t write good non-English” and even seems “worse than Llama 3.3 70b.”
Other German translation tests were more positive, with the smaller model outperforming Llama 3.x and coming surprisingly close to ChatGPT 3.5, though the testers noted that ChatGPT 3.5 itself is not considered a top-tier translation model.
Chris Hayduk, Machine Learning Engineer at Meta, shared that when he used it for translation, it “stumbled with really awkward/incorrect phrasing and grammar,” which he attributed to “worse knowledge of grammar and phrasing for non-English languages due to the smaller base model.”
The Importance of Proper Testing
Lilt CEO Spence Green shared on LinkedIn some quick benchmark results pitting GPT-OSS 120B against Google Translate, the open-source multilingual EMMA 500 model, and Lilt’s own AI translation model. Closed models still lead in quality, but GPT-OSS emerged as the most capable open-source multilingual model in the comparison.
“It is AWESOME to have an OSS multilingual model of this capability for all sorts of multilingual tasks,” Green said, promising to share GPT-5 results soon.
Gert Van Assche, CTO at DATAmundi (formerly Summa Linguae Technologies, rebranded in April 2025), highlighted the importance of proper testing before deploying new models, warning that “switching an LLM in production isn’t just a technical swap, it carries risk. New models might be faster or cheaper, but not always better or safer.”
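As a rough illustration of the kind of pre-deployment check Van Assche describes, the Python sketch below scores a current and a candidate model on the same fixed test set and approves a switch only if the candidate does not regress. Everything in it is an assumption made for illustration: the test sentences, the stub "models," the function names, and the character-level similarity score, which is a crude stand-in for a real translation-quality metric or human review.

```python
from difflib import SequenceMatcher
from statistics import mean
from typing import Callable, List, Tuple

# Hypothetical test set of (source text, reference translation) pairs.
TEST_SET: List[Tuple[str, str]] = [
    ("Guten Morgen", "Good morning"),
    ("Vielen Dank", "Thank you very much"),
]


def similarity(hypothesis: str, reference: str) -> float:
    """Crude character-level similarity in [0, 1]; a stand-in for a real MT metric."""
    return SequenceMatcher(None, hypothesis.lower(), reference.lower()).ratio()


def evaluate(translate: Callable[[str], str], test_set: List[Tuple[str, str]]) -> float:
    """Average similarity of a model's output against the reference translations."""
    return mean(similarity(translate(src), ref) for src, ref in test_set)


def safe_to_switch(
    current: Callable[[str], str],
    candidate: Callable[[str], str],
    test_set: List[Tuple[str, str]],
    tolerance: float = 0.0,
) -> bool:
    """Approve the swap only if the candidate scores no worse than the current model."""
    return evaluate(candidate, test_set) >= evaluate(current, test_set) - tolerance


# Stub "models" (hypothetical) so the sketch runs without any inference backend.
def current_model(text: str) -> str:
    outputs = {"Guten Morgen": "Good morning", "Vielen Dank": "Thank you very much"}
    return outputs.get(text, "")


def candidate_model(text: str) -> str:
    outputs = {"Guten Morgen": "Good morning", "Vielen Dank": "Many thanks"}
    return outputs.get(text, "")


if __name__ == "__main__":
    print("Current score:  ", round(evaluate(current_model, TEST_SET), 3))
    print("Candidate score:", round(evaluate(candidate_model, TEST_SET), 3))
    print("Safe to switch: ", safe_to_switch(current_model, candidate_model, TEST_SET))
```

In a real evaluation, the stubs would be replaced by calls to the actual models, and the test set would cover each language pair and content type in production.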
For those eager to experiment, OpenAI has made the models available in its open model playground, along with documentation in the OpenAI Cookbook for fine-tuning and local deployment. “Forget about the GPT-5 drama and start doing cool stuff!!!” Eduardo Ordax, Generative AI Lead at AWS, said.