OpenAI launches GPT-5.6: Sol, Terra, and Luna in limited preview

TL;DR: OpenAI has launched a preview of GPT-5.6 with three models: Sol (flagship), Terra (balanced; OpenAI says it competes with GPT-5.5 at half the price), and Luna (fast and cheap). The preview is limited by agreement with the U.S. government. General availability is expected within weeks.

What happened?

OpenAI has begun a limited preview of the GPT-5.6 series, which includes three models: Sol, the flagship; Terra, a balanced model for daily work; and Luna, a fast and affordable model. According to the company, Terra performs competitively with GPT-5.5 at half the price, while Luna offers solid capabilities at the lowest cost in the family. The preview is restricted to a small group of trusted partners, whose names have been shared with the U.S. government at its request.

The move is not unprecedented: OpenAI also ran a limited preview with safety partners before launching GPT-4. Disclosing the partner list to the government is an additional step, and it reflects growing regulatory scrutiny of frontier models, in line with Biden's 2023 AI executive order and congressional hearings on AI. Compared with previous launches whose previews were more open, OpenAI has prioritized regulatory compliance here, which could delay mass adoption but also build institutional trust.

Pricing and cache details

Prices per million tokens are: Sol, $5 input / $30 output; Terra, $2.50 / $15; Luna, $1 / $6. By comparison, GPT-5.5 cost $10 input and $40 output, so Terra is 75% cheaper on input and 62.5% cheaper on output than its direct predecessor, a deeper cut than the "half the price" headline suggests. Luna, meanwhile, is 90% cheaper on input than GPT-5.5. Terra and Luna compete directly with Anthropic's Claude 3.5 Sonnet ($3 input / $15 output) and Google's Gemini 1.5 Pro ($3.50 input / $10.50 output), while Sol remains more expensive than both.

GPT-5.6 also introduces a more predictable prompt caching system, with explicit breakpoints and a minimum lifetime of 30 minutes. Cache writes are billed at 1.25x the uncached input rate, while reads keep a 90% discount. This is a significant improvement over GPT-5.5's cache, which was less predictable and had a shorter minimum lifetime. For developers processing repetitive prompts (e.g., in chatbots or log analysis), the system can reduce costs by up to 50% if prompts are structured so that most requests hit the cache. Explicit breakpoints also let developers force cache invalidation, which helps applications that require fresh responses.
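The cache billing rules above can be turned into a rough break-even estimate. The following is a minimal sketch, not an official calculator: it assumes that the first request writes the shared prefix to the cache and that every later request reads it before the entry expires. The function names and the example workload are hypothetical; only the prices and multipliers come from the figures quoted above.

```python
# Input prices in USD per million tokens, as quoted in this article.
INPUT_PRICE_PER_MTOK = {"Sol": 5.00, "Terra": 2.50, "Luna": 1.00}

CACHE_WRITE_MULTIPLIER = 1.25  # writes billed at 1.25x the uncached input rate
CACHE_READ_MULTIPLIER = 0.10   # reads billed at a 90% discount


def prefix_cost(model: str, prefix_tokens: int, requests: int, cached: bool) -> float:
    """Input cost in USD of sending the same prefix `requests` times."""
    if requests <= 0:
        return 0.0
    rate = INPUT_PRICE_PER_MTOK[model] / 1_000_000  # USD per token
    if not cached:
        return prefix_tokens * rate * requests
    # Assumption: one cache write, then every later request is a cache read.
    write = prefix_tokens * rate * CACHE_WRITE_MULTIPLIER
    reads = prefix_tokens * rate * CACHE_READ_MULTIPLIER * (requests - 1)
    return write + reads


def prefix_savings(requests: int) -> float:
    """Fraction saved on the prefix by caching (negative means caching costs more)."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    billed = CACHE_WRITE_MULTIPLIER + CACHE_READ_MULTIPLIER * (requests - 1)
    return 1 - billed / requests


if __name__ == "__main__":
    # Hypothetical workload: a 10,000-token system prompt reused 100 times on Terra.
    print(prefix_cost("Terra", 10_000, 100, cached=False))
    print(prefix_cost("Terra", 10_000, 100, cached=True))
    for n in (1, 2, 10):
        print(n, round(prefix_savings(n), 3))
```

Under these assumptions, a prefix sent only once costs 25% more with caching, two requests already save 32.5% on the prefix, and ten requests save 78.5%. These figures cover the cached prefix only; output tokens and uncached input are billed at the normal rate, so the saving on a whole bill is smaller.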

Why is this important?

This preview marks a milestone in OpenAI's strategy to democratize access to AI, offering scalable options that fit different budgets and use cases. Historically, OpenAI has priced aggressively: GPT-3.5 Turbo launched at $0.002 per 1K tokens in March 2023, GPT-4 launched the same month at a premium of $0.03 per 1K input tokens, and later models have steadily undercut that premium, a trend GPT-5.6 continues. This pressures competitors like Anthropic and Google to adjust their prices, benefiting startups and companies that rely on LLMs. However, the limited preview could create an access gap: selected partners (possibly large tech companies) could gain a temporary competitive advantage, while SMEs will have to wait weeks. Moreover, transparency with the government could set a precedent for future frontier model launches, in line with recommendations from bodies like the AI Safety Institute.

Market implications

The GPT-5.6 series pushes down prices in the LLM sector, intensifying competition with Anthropic, Google, and others. The three-model segmentation allows OpenAI to capture both premium customers and budget-conscious developers. The limited preview could also create temporary scarcity and raise expectations about Sol's actual performance. According to analysts, if Sol clearly outperforms GPT-5.5 on benchmarks like MMLU or HumanEval, it could consolidate OpenAI's leadership, but if the gains are marginal, competitors could gain ground. Predictable caching could likewise become an industry standard, pushing Anthropic and Google to offer comparable systems.

In the LLM API market, a price war similar to that of 2023 is expected, when OpenAI cut prices after the launch of Claude 2. Startups using OpenAI models could see lower operational costs, but they will also face pressure to migrate quickly to the new models once available, incurring integration costs. On the other hand, the limited preview could delay adoption in regulated sectors like healthcare or finance, which require more thorough safety evaluations.

What readers should know

Prices are competitive, but the preview is open only to selected partners, with general availability expected within weeks. The caching feature with breakpoints is new and can cut costs in applications with repetitive prompt patterns, so companies planning to use GPT-5.6 should evaluate whether it requires changes to their prompt architecture. It is important to monitor independent benchmarks that emerge during the preview, as they will determine whether Sol truly justifies its premium price. Transparency with the government could also imply that OpenAI is sharing partner usage data, which raises privacy questions. In the meantime, developers can keep testing with current models and plan their migration for when the new ones become publicly available.
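For migration planning, it can help to price an existing workload against each tier. The sketch below is an illustration under stated assumptions: the per-million-token prices are the ones quoted in this article (including the GPT-5.5 figures), the function names are hypothetical, and the example volume of 200 million input and 20 million output tokens per month is made up. It ignores caching discounts.

```python
# (input, output) prices in USD per million tokens, as quoted in this article.
PRICES_PER_MTOK = {
    "GPT-5.5": (10.00, 40.00),
    "Sol": (5.00, 30.00),
    "Terra": (2.50, 15.00),
    "Luna": (1.00, 6.00),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a monthly token volume on one model, without caching."""
    input_price, output_price = PRICES_PER_MTOK[model]
    return (input_tokens / 1_000_000 * input_price
            + output_tokens / 1_000_000 * output_price)


def compare(input_tokens: int, output_tokens: int,
            baseline: str = "GPT-5.5") -> dict[str, tuple[float, float]]:
    """Map each model to (cost, relative change versus the baseline model)."""
    base = monthly_cost(baseline, input_tokens, output_tokens)
    if base == 0:
        raise ValueError("workload must contain at least one token")
    result = {}
    for model in PRICES_PER_MTOK:
        cost = monthly_cost(model, input_tokens, output_tokens)
        result[model] = (cost, (cost - base) / base)
    return result


if __name__ == "__main__":
    # Hypothetical workload: 200M input tokens and 20M output tokens per month.
    for model, (cost, change) in compare(200_000_000, 20_000_000).items():
        print(f"{model:8s} ${cost:>9,.2f}  {change:+.1%}")
```

For this example volume, the quoted prices work out to $2,800 on GPT-5.5, $1,600 on Sol, $800 on Terra, and $320 on Luna. Because output tokens are discounted less than input tokens, output-heavy workloads will see smaller relative savings than this input-heavy example.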
