OpenAI for Government: A Bold Bet on AI in the Public Sector — But Are We Ready?

By Courtney King | August 6, 2025

Last week, OpenAI launched a new initiative called OpenAI for Government to bring its most advanced models and tools, including ChatGPT Enterprise and ChatGPT Gov, to federal, state, and local public servants across the United States. The announcement signals a major strategic push into the public sector, with OpenAI partnering with agencies like the Department of Defense, NIH, NASA, and even the Treasury. It comes packaged with promises of cutting red tape, modernizing service delivery, and empowering civil servants to do more of what they were hired to do.

It all sounds promising. But here’s the uncomfortable truth: the U.S. federal government is nowhere near equipped to safely, effectively, or ethically deploy frontier AI tools at this scale.

The intention is clear. The infrastructure is not.

The Gap Between Tech Ambition and Institutional Readiness

Government adoption of transformative technologies is nothing new. But AI presents a fundamentally different risk profile than, say, cloud computing or even cybersecurity modernization. These systems are probabilistic, opaque, prone to hallucination, and deeply dependent on context. They reflect their training data, inherit social biases, and operate in ways that even their creators don’t fully understand.

And yet, despite these well-known challenges, we are preparing to roll them out to tens of thousands of civil servants, many of whom do not have the training, infrastructure, or governance frameworks needed to use them responsibly.

This is not a knock on public sector workers. In fact, the best among them are hungry for innovation. But the systems they operate within (procurement, compliance, data management, IT policy, and regulatory review) were not built to support the speed and ambiguity of AI deployment. There are several critical weaknesses:

  • Lack of AI governance infrastructure: Most federal agencies do not have internal policies, risk assessment tools, or review boards equipped to evaluate the ethical or operational impact of generative AI systems.

  • Data readiness issues: LLMs need high-quality, well-structured data to produce reliable results. Government systems are often siloed, outdated, and riddled with inconsistent formats.

  • Procurement challenges: Federal procurement cycles are notoriously slow and rigid. Evaluating and procuring cutting-edge AI solutions while ensuring security, accessibility, and fairness is a minefield.

  • Low technical fluency among decision-makers: Many leaders in government are still getting up to speed on what generative AI even is, much less how it might shift their operating models.

The Danger of Premature Scaling

The pilot project with the Department of Defense’s Chief Digital and Artificial Intelligence Office (CDAO) suggests a massive appetite for rapid deployment. But scale without discipline is dangerous.

When an AI tool is handed to a civil servant with minimal training and no guardrails, the risk isn’t just error or inefficiency. The consequences can include:

  • Misuse of sensitive data

  • Automation of harmful or biased decisions

  • Inadvertent policy violations

  • Escalating trust issues among the public

Consider what happens when generative AI is used to summarize benefit applications, route constituent emails, or flag cases for fraud. These are high-stakes decisions. Errors aren’t just bugs; they have real human costs.

What Needs to Be True Before This Works

We don’t have to slam the brakes on government AI adoption. But we do need to be honest about where we are and what has to be built. Here are five foundational capabilities federal agencies need before frontier tools like ChatGPT can be deployed responsibly:

  1. Agency-specific AI governance frameworks

    • With clear roles, responsibilities, and review processes for AI deployment

  2. Robust risk and bias auditing

    • Including external oversight of models used for high-impact decisions

  3. Workforce training and literacy

    • Not just prompt engineering 101, but understanding limitations, risks, and escalation protocols

  4. Strong data infrastructure and access controls

    • Secure, high-quality, and equitable data pipelines are essential for meaningful output

  5. Human-in-the-loop systems by default

    • With mandatory human oversight for any automated recommendation or decision (a minimal sketch of this pattern follows this list)
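
To make the last point concrete, here is a minimal sketch, in Python, of what "human-in-the-loop by default" can look like at the application layer: every model-generated recommendation is held in a pending state, and nothing downstream executes until a named reviewer signs off. The `Recommendation` class and the `review` and `execute` functions are illustrative assumptions for this post, not part of any OpenAI product or government system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Callable, Optional


class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Recommendation:
    """An AI-generated recommendation that cannot act on its own."""
    case_id: str
    summary: str                          # model output shown to the reviewer
    status: ReviewStatus = ReviewStatus.PENDING
    reviewer: Optional[str] = None
    reviewed_at: Optional[datetime] = None


def review(rec: Recommendation, reviewer: str, approve: bool) -> None:
    """Record a mandatory human decision, with an audit trail."""
    rec.status = ReviewStatus.APPROVED if approve else ReviewStatus.REJECTED
    rec.reviewer = reviewer
    rec.reviewed_at = datetime.now(timezone.utc)


def execute(rec: Recommendation, action: Callable[[Recommendation], None]) -> None:
    """Hard gate: refuse to run anything a human has not approved."""
    if rec.status is not ReviewStatus.APPROVED:
        raise PermissionError(f"Case {rec.case_id}: no human approval on record")
    action(rec)


if __name__ == "__main__":
    # Hypothetical benefits-processing example (case ID is made up)
    rec = Recommendation(case_id="BEN-2025-0142",
                         summary="Flag application for manual fraud review")
    review(rec, reviewer="caseworker.jsmith", approve=True)
    execute(rec, lambda r: print(f"Routing {r.case_id}: {r.summary}"))
```

The point of the pattern is that the gate is structural, not procedural: the system physically cannot act on an unreviewed recommendation, and every approval leaves a record of who signed off and when.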

The Role of Civil Society and Independent Advisors

The federal government shouldn’t be left to navigate this alone. Civil society organizations, researchers, and independent advisors have a critical role to play in shaping how AI is introduced into the public sector. We need more public consultations, more independent audits, and more transparency around which tools are being used and why.

That’s where we come in.

Stillwater helps governments assess their AI maturity, build internal guardrails, and deploy tools like ChatGPT with safety, compliance, and impact in mind. We know how to bridge the gap between tech ambition and operational reality.

OpenAI for Government is a bold move. But boldness alone doesn’t build trust. That comes from preparation, integrity, and clear boundaries.

Let’s help government get this right.
