OSD’s data-driven oversight of the military services and an alternative path forward

GAO’s Shelby Oakley: I think the biggest thing that I see right now that is a challenge for Ms. Cummings’ office is their ability to, in fact, conduct data-driven oversight, because oversight is still important. Even if the Services are running the show for their own programs, DoD, as the portfolio manager — OSD as the portfolio manager of all of the weapon programs has to play a role in determining what we are investing in and understanding and deciding what those risks are and where they are at, and are the Services taking appropriate risks. And they are working real hard to try and figure out what that data is that they need from the Services to be able to do that kind of oversight, but, you know, the Services are feeling empowered and in charge, and I think that there is a little bit of a struggle there in terms of what data and how they are going to get transparent data from the services to be able to make smart decisions from an OSD level for those programs…

USD(A&S)’s Stacy Cummings: We also need time to standardize the data element, so we understand what data means, and then the systems to be able to analyze it. Right now our analytics is very, very people-focused. We want to move it to be system-focused. We want to be able to take machine learning, put data into a system where we can look at it more holistically for trends, as opposed to the way we historically have looked at data, which is to dive in, and I think, Senator Sullivan, you brought this up, for us to be able to dive into individual programs.

That was from an April 28, 2021, SASC hearing on Defense Acquisition Programs and Acquisition Reform. The pointer came from an FCW article about it.

I think the framing of the problem as Oakley described it is one of power. Delegation of acquisition authority to the services since 2016 has turned the parochial services into fiefdoms. Cummings asserted that OSD has the authority to access acquisition data and is making progress with its systems. In her written testimony, Cummings pointed to progress on the Acquisition Visibility Data Framework (AVDF), which will provide standards for reporting across the six separate pathways and ultimately will be made available through ADVANA.

Here’s the fundamental problem as I understand it: the more standardized the data are, the less context and nuance they convey, and the less meaningful the insights derived from them will be, no matter how sophisticated the analytical techniques. Here’s a nice bit from F.A. Hayek’s “The Use of Knowledge in Society”:

…the sort of knowledge with which I have been concerned is knowledge of the kind which by its nature cannot enter into statistics and therefore cannot be conveyed to any central authority in statistical form. The statistics which such a central authority would have to use would have to be arrived at precisely by abstracting from minor differences between the things, by lumping together, as resources of one kind, items which differ as regards location, quality, and other particulars, in a way which may be very significant for the specific decision. It follows from this that central planning based on statistical information by its nature cannot take direct account of these circumstances of time and place and that the central planner will have to find some way or other in which the decisions depending on them can be left to the ‘man on the spot.’

For example, Oakley described how the Selected Acquisition Report (SAR) would no longer be a statutory requirement after FY 2021. While the SAR only applies to the largest ACAT I programs, we can imagine similar reports being required for ACAT II and III programs as well. But insight into high-level requirements, cost, and schedule is only a valid method of oversight if the baseline is “correct” in all of its particulars.

Even granular knowledge, like the cost and schedule of every activity in an integrated master schedule, tells one nothing about whether (1) it was the right thing to be doing and (2) it was being done the right way. In other words, modern oversight focuses on the distribution of money costs rather than valuations of opportunity cost. What choices were made, what were the alternatives, and was the course of action wise?

Most modern oversight uses the acquisition baseline as a source of enduring truth. But that puts a premium on prediction and control, and it presumes full knowledge is available prior to Milestone B, when full-scale development gets started — the rest is just execution to a plan. That’s why so much time and effort is spent planning. And that’s why cost growth is treated as a proxy for value delivery.

Clearly, the idea of incremental investment decisions is antithetical to program baselines. One expects change. The other denies change.

Perhaps DORA-like metrics will help solve some of the problem for software programs, as opposed to an acquisition baseline model. The DORA metrics don’t get to contextual analysis or value, but there is evidence from the commercial world that they correlate with value. This works best for persistent capabilities, where you can see trends over time, such as whether change-failure rate or velocity is improving.
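To make that concrete, here is a minimal sketch, assuming entirely hypothetical deployment records (the field names and sample data are mine, not anything DoD actually collects), of how the four DORA metrics (deployment frequency, lead time for changes, change-failure rate, and time to restore) could be rolled up and trended by quarter for a persistent capability:

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Deployment:
    deployed_at: datetime   # when the change reached users
    commit_at: datetime     # when the change was first committed
    failed: bool            # did this release degrade service?
    restore_hours: float    # hours to restore service if it failed, else 0.0


def dora_summary(deploys: list[Deployment]) -> dict:
    """Roll up the four DORA metrics for one reporting period."""
    if not deploys:
        return {}
    lead_times = [(d.deployed_at - d.commit_at).total_seconds() / 3600 for d in deploys]
    failures = [d for d in deploys if d.failed]
    return {
        "deploy_count": len(deploys),
        "median_lead_time_hours": median(lead_times),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_restore_hours": (
            sum(d.restore_hours for d in failures) / len(failures) if failures else 0.0
        ),
    }


def quarterly_trend(deploys: list[Deployment]) -> dict[str, dict]:
    """Group deployments by quarter so trends (improving or not) become visible."""
    by_quarter: dict[str, list[Deployment]] = defaultdict(list)
    for d in deploys:
        quarter = f"{d.deployed_at.year}Q{(d.deployed_at.month - 1) // 3 + 1}"
        by_quarter[quarter].append(d)
    return {q: dora_summary(ds) for q, ds in sorted(by_quarter.items())}


if __name__ == "__main__":
    # Illustrative data only: three deployments across two quarters.
    deploys = [
        Deployment(datetime(2021, 4, 2), datetime(2021, 3, 30), False, 0.0),
        Deployment(datetime(2021, 4, 20), datetime(2021, 4, 18), True, 6.0),
        Deployment(datetime(2021, 7, 9), datetime(2021, 7, 8), False, 0.0),
    ]
    for quarter, summary in quarterly_trend(deploys).items():
        print(quarter, summary)
```

The point is not the arithmetic; it is that a persistent software effort can report these trends continuously from its own pipeline, rather than against a baseline fixed years earlier.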

Ultimately, the idea I struggle with is providing oversight of specific programs. This perpetuates the stovepiped view of military capabilities, and it forces attention onto the largest programs while letting smaller efforts escape scrutiny. What is needed are coherent narratives of force structure design and a system for putting all efforts into that context.

Rather than duplicating or reviewing service-level analyses, CAPE analysts could be the “smartest people in the room,” constantly evaluating specific projects with larger objectives in mind. They could be influencers and connectors rather than gatekeepers to budgets. They would focus on analysis of actuals and their consequences rather than on future program plans. That makes the analysis tractable, because debating future plans is largely a matter of opinion.

Rather than having a standardized set of metrics, programs could be tasked with consistently tracking and reporting what matters to them — which should be guided by what is valuable to the users. These non-standard metrics, along with qualitative reports of tests and progress, could all flow into a data lake accessible for oversight and be piped, as needed, into machine learning algorithms to parse. That use of unstructured data seems like the future. Everyone reporting the same standard metrics might not convey what is actually important.
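As a thought experiment, here is a minimal sketch of what such a lake could look like, assuming a simple file-based store and entirely hypothetical program names, metrics, and narratives: each program writes whatever metrics and narrative matter to it as a JSON document with no imposed schema, and a naive text search stands in for whatever machine learning would eventually parse the unstructured reports.

```python
from __future__ import annotations

import json
from datetime import date
from pathlib import Path

# Hypothetical landing zone for raw, schema-free program reports.
LAKE = Path("acquisition_lake")
LAKE.mkdir(exist_ok=True)


def ingest_report(program: str, reported: date, metrics: dict, narrative: str) -> Path:
    """Store a program's self-defined metrics and qualitative narrative as-is.

    No common schema is imposed; each program reports what matters to it."""
    doc = {
        "program": program,
        "reported": reported.isoformat(),
        "metrics": metrics,        # free-form, program-defined keys
        "narrative": narrative,    # test reports, user feedback, etc.
    }
    path = LAKE / f"{program}_{reported.isoformat()}.json"
    path.write_text(json.dumps(doc, indent=2))
    return path


def search_narratives(term: str) -> list[dict]:
    """A naive full-text scan; a real pipeline might hand this text to an NLP model."""
    hits = []
    for path in LAKE.glob("*.json"):
        doc = json.loads(path.read_text())
        if term.lower() in doc["narrative"].lower():
            hits.append({"program": doc["program"], "reported": doc["reported"]})
    return hits


if __name__ == "__main__":
    # Two notional programs reporting entirely different metrics into the same lake.
    ingest_report("program_a", date(2021, 6, 30),
                  {"sorties_supported": 12, "mean_data_latency_ms": 180},
                  "Field test showed operators could fuse tracks faster than legacy tools.")
    ingest_report("program_b", date(2021, 6, 30),
                  {"deploys_per_week": 9, "change_failure_rate": 0.06},
                  "Planners report the new scheduling flow saves about an hour per day.")
    print(search_narratives("operators"))
```

The design choice here is deliberate: the lake preserves each program’s context instead of forcing it into a universal template, and the burden of synthesis falls on the analysts and tools downstream.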

Moreover, oversight to me seems rooted in organizational design and incentives. Knowing who to trust makes a huge difference, and so organizations must be set up such that individuals can establish track records. Rickover is an obvious example of this. A slightly different example came with the Air Force’s decision to go with Lockheed’s Skunk Works for the Have Blue (later, F-117) program:

Frankly, I’m not even sure the goddam thing will fly. But if Ben Rich and the Skunk Works says that they can deliver the goods, I think we’d be idiots not to go along with them. General Dixon wholeheartedly agreed with me. And so we started the stealth program on the basis of Ben’s twenty-minute presentation and a hell of a lot of faith in Ben Rich & Company. And that faith was based on long personal experience.

That is why I like the board and committee structure that prevailed in the defense enterprise into the 1950s (e.g., the Navy General Board, the DoD R&D Board, the Munitions Board). Board members could be drawn from the senior leaders at the PEOs responsible for getting the work done, and, dual-hatted, they would come to know thoroughly what the others are doing in order to coordinate action. PMs might serve on subcommittees for various capabilities, and their functionals might make up various task forces. This way, you have overlapping neighborhoods of people involved in programs, and people get to know what the others are doing and whom to trust. In some ways, this aligns with scaled agile techniques like the scrum of scrums.

Such an organizational design for oversight brings programs out of their stovepipes. OSD can focus on curating the actuals data put into the data lake. CAPE could be the “wild card” analysts with deep experience analyzing how specific projects contribute to larger objectives. The Comptroller makes sure funds are sufficient and the books are in good order. OMB, DoDIG, GAO, and other high-level oversight agencies can then debate strategic-level decisions and perform investigations of problem areas with an emphasis on value delivery over process conformance.

This is one shot at an alternative framework for oversight. There are positives and negatives. One thing I’m pretty sure about is the need to move the focus from process to value, from universal metrics to contextual metrics, and from money costs to opportunity costs. I’ll leave you with an interesting quote from Bryon Kroger:

When people lack context, we have to lead with control. The new model is freedom and responsibility in a devops world. You lead with context, not control. And we are severely lacking in context.

1 Comment

  1. With regard to the statements: “Clearly, the idea of incremental investment decisions is antithetical to program baselines. One expects change. The other denies change.”…I’ll put on my ‘don’t hate the tool because it is implemented incorrectly’ hat. Having program baselines does not inherently deny change. Project Scope Management and Project Integration Management (PMI knowledge areas) both provide for the identification, modification, and execution of multiple project baselines as the situation warrants. In practice, the DoD has set initial baselines in stone and made changing them incredibly painful. Again, that’s the DoD’s implementation of baseline management, not baseline management itself.