How Top Companies Measure Psychological Safety (With Frameworks You Can Steal)

Google surveys for it. Etsy watches for it in postmortems. Microsoft bakes it into Viva. Here are the exact measurement frameworks these companies use, and how to adapt each one for your team.

By Asa Goldstein, QuestWorks

TL;DR

Most companies say they value psychological safety. Few measure it with any rigor. Google combined pulse surveys with behavioral observation in Project Aristotle. Edmondson's 7-item TPS scale gives you a validated instrument you can deploy this week. Etsy's blameless postmortems turn incident reviews into a live measurement of candor. Microsoft's Viva platform embeds psychological safety signals into daily workflow data. The best approach combines at least two methods: one that captures sentiment and one that captures behavior. Here are the frameworks, the questions, and the blind spots for each.

Only 43% of workers report a positive team climate, according to McKinsey research. That number has barely moved in five years, despite psychological safety becoming one of the most cited concepts in organizational psychology. The problem is not awareness. Every leadership team has seen the Google data. The problem is measurement. You cannot improve what you cannot see, and most organizations have no system for seeing psychological safety at all.

Annual engagement surveys ask about "feeling valued" and "having voice." Those questions are useful for engagement. They are terrible proxies for psychological safety, which is specifically about the willingness to take interpersonal risk: speaking up with a dissenting opinion, admitting a mistake, asking a question that might seem basic. That is a different construct, and it requires different measurement.

Here are four frameworks used by companies that take this seriously. Each one has strengths and blind spots. The smart move is to combine at least two.

1. Edmondson's 7-Item Survey (The Academic Gold Standard)

Amy Edmondson published her Team Psychological Safety scale (TPS-7) in 1999, and it remains the most validated instrument in the field. It consists of seven statements rated on a 7-point Likert scale, from "strongly disagree" to "strongly agree." Total scores range from 7 to 49 (NovoPsych TPS-7 Overview).

The seven items cover ground that most engagement surveys miss entirely:

  • "If you make a mistake on this team, it is often held against you." (reverse-scored)
  • "Members of this team are able to bring up problems and tough issues."
  • "People on this team sometimes reject others for being different." (reverse-scored)
  • "It is safe to take a risk on this team."
  • "It is difficult to ask other members of this team for help." (reverse-scored)
  • "No one on this team would deliberately act in a way that undermines my efforts."
  • "Working with members of this team, my unique skills and talents are valued and utilized."

Why it works: The survey is short enough to deploy quarterly without causing survey fatigue. It measures the actual construct (interpersonal risk tolerance) rather than proxies like "satisfaction" or "engagement." And because it is team-level, you get data that maps to the unit where safety actually lives.

The blind spot: It captures sentiment, not behavior. A team can score 45 out of 49 and still have one dominant voice doing 80% of the talking in meetings. People sometimes report feeling safe because they have never been tested. The survey tells you what people believe. It does not tell you what they do under pressure.

How to steal it: Deploy the 7 items as a standalone pulse survey every 8 to 12 weeks. Track scores by team, not by individual. Share results with the team (not just the manager) and discuss the two lowest-scoring items in a dedicated retrospective. The University of South Carolina assessment framework provides a free implementation guide.
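The scoring and the "two lowest-scoring items" step can be sketched in a few lines of Python. This is a minimal sketch, not an official scoring tool: it assumes responses arrive as seven ratings in the order the items are listed above, that reverse-scored items are flipped as 8 minus the rating, and the function names and sample data are hypothetical.

```python
from statistics import mean

# Item positions (1-based) follow the order of the seven statements above.
REVERSE_SCORED = {1, 3, 5}  # the negatively worded items
SCALE_MAX = 7               # 7-point Likert scale


def score_response(raw):
    """Return one respondent's seven item scores after reverse-scoring."""
    if len(raw) != 7 or any(not 1 <= r <= SCALE_MAX for r in raw):
        raise ValueError("expected seven ratings between 1 and 7")
    return [
        (SCALE_MAX + 1 - r) if position in REVERSE_SCORED else r
        for position, r in enumerate(raw, start=1)
    ]


def team_summary(responses):
    """Aggregate anonymous responses for one team (never per individual)."""
    scored = [score_response(r) for r in responses]
    item_means = [mean(column) for column in zip(*scored)]
    team_total = mean(sum(s) for s in scored)  # falls between 7 and 49
    # Positions of the two lowest-scoring items, to discuss in the retrospective.
    lowest_two = sorted(range(1, 8), key=lambda i: item_means[i - 1])[:2]
    return {
        "team_total": team_total,
        "item_means": item_means,
        "lowest_two": lowest_two,
    }


# Hypothetical raw ratings from a three-person team (illustrative data only).
responses = [
    [2, 6, 1, 6, 2, 7, 6],
    [3, 5, 2, 5, 4, 6, 6],
    [1, 6, 2, 4, 5, 7, 5],
]
summary = team_summary(responses)
print(summary["team_total"], summary["lowest_two"])
```

In this made-up example the two lowest items are the fifth (asking for help) and the fourth (taking risks), which is where the retrospective conversation would start.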

2. Google's Project Aristotle (Survey + Behavioral Observation)

Google studied 180 teams through Project Aristotle and found that psychological safety explained 43% of the variance in team performance, making it the single strongest predictor of effectiveness (Google re:Work). Teams high in psychological safety outperformed others by 27%.

Google's internal approach went beyond a single survey. They used a "gTeams" survey that included questions like "Can you take risks on this team without feeling insecure?" alongside behavioral observations and real-time feedback loops. The survey data was triangulated against team output metrics, meeting participation patterns, and peer feedback.

Why it works: Combining survey data with behavioral observation catches what either method alone would miss. A team that reports high safety but shows low participation in code reviews has a measurement gap. Google's approach closes that gap by looking at what people say AND what they do.

The blind spot: This approach requires infrastructure that most companies do not have. Google built custom tooling on top of its internal systems. The observation component is labor-intensive and difficult to scale without technology support. And the results are only as good as the manager's willingness to act on them.

How to steal it: You do not need Google's scale. Start with Edmondson's 7 items as your survey layer. Then add one behavioral metric: speaking-time distribution in team meetings. Research from MIT's Human Dynamics Lab found that equal distribution of conversational turn-taking is one of the strongest predictors of collective intelligence. If one or two people dominate every meeting, your survey scores are lying to you.
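One way to put a number on speaking-time distribution is to compute each person's share of the meeting and compare the top two speakers against an even split. The sketch below assumes you already have per-person speaking time in seconds (from a transcript tool or a manual tally); the function names, the top-two measure, and the sample meeting are illustrative choices, not a method taken from Google or MIT.

```python
def speaking_shares(seconds_by_speaker):
    """Return each person's share of total speaking time (0.0 to 1.0)."""
    total = sum(seconds_by_speaker.values())
    if total <= 0:
        raise ValueError("no speaking time recorded")
    return {name: secs / total for name, secs in seconds_by_speaker.items()}


def top_two_share(seconds_by_speaker):
    """Share of the meeting taken up by the two most talkative people."""
    shares = sorted(speaking_shares(seconds_by_speaker).values(), reverse=True)
    return sum(shares[:2])


# Hypothetical 30-minute meeting, speaking time in seconds (illustrative data).
meeting = {"Ana": 720, "Ben": 540, "Chidi": 180, "Dee": 240, "Eli": 120}

dominance = top_two_share(meeting)
even_split = 2 / len(meeting)  # what two people would hold if time were equal
print(f"top two speakers: {dominance:.0%}, even split would be {even_split:.0%}")
```

A single meeting says little. The useful signal is whether the gap between the top-two share and the even split stays wide across many meetings.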

3. Etsy's Blameless Postmortems (Behavioral Measurement Through Process)

Etsy pioneered the blameless postmortem as a standard engineering practice. After significant incidents, the team collectively builds a timeline, extracts lessons, and develops recommendations, all without assigning individual blame. As Etsy's engineering team wrote: "Engineers who think they're going to be reprimanded are disincentivized to give the details necessary to understand the mechanism, pathology, and operation of the failure" (Etsy Code as Craft).

The postmortem process doubles as a live measurement of psychological safety. If team members disclose fully, share context about their decision-making, and speak candidly about what they did not know, safety is present. If they hedge, deflect, or stay silent, it is not.

Why it works: This is behavioral measurement embedded in a process people are already doing. There is no additional survey. There is no additional meeting. You are measuring safety by watching it operate (or fail to operate) under real conditions. Etsy's then-CEO Chad Dickerson specifically noted that the practice enabled employees to "take more risks and move faster."

The blind spot: Postmortems only happen after incidents. If your team does not ship enough, or does not have incidents, you do not get data. The method also depends heavily on the facilitator's skill. A poorly facilitated postmortem can actually damage safety by creating an environment where the "blameless" label feels performative. Etsy published a Debriefing Facilitation Guide specifically to address this risk.

How to steal it: Extend the blameless postmortem format beyond incidents. Run a "project postmortem" after every sprint or major milestone. Use the same facilitation guide. Track two metrics over time: average number of contributing voices per postmortem (breadth) and the specificity of disclosed mistakes (depth). Both should increase as safety grows.
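Tracking breadth and depth over time can be as simple as a small log and a before/after comparison. The sketch below is one possible setup, not Etsy's method: it assumes breadth is the count of distinct contributors per session and that depth is a 1-to-5 specificity rating given by the facilitator, which is a hypothetical rubric you would need to define yourself.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Postmortem:
    date: str          # ISO date, so string order matches time order
    contributors: set  # distinct people who added to the timeline or discussion
    specificity: int   # facilitator rating, 1 (vague) to 5 (concrete); hypothetical


def averages(group):
    """Return (average breadth, average depth) for a group of postmortems."""
    breadth = mean(len(p.contributors) for p in group)
    depth = mean(p.specificity for p in group)
    return breadth, depth


def trend(postmortems, window=3):
    """Compare the latest `window` sessions with the `window` before them."""
    if len(postmortems) < 2 * window:
        raise ValueError("need at least two full windows of postmortems")
    ordered = sorted(postmortems, key=lambda p: p.date)
    earlier = averages(ordered[-2 * window:-window])
    recent = averages(ordered[-window:])
    return {
        "breadth_change": recent[0] - earlier[0],
        "depth_change": recent[1] - earlier[1],
    }


# Hypothetical log of six sprint postmortems (illustrative data only).
log = [
    Postmortem("2024-01-15", {"ana", "ben"}, 2),
    Postmortem("2024-02-12", {"ana", "ben", "dee"}, 2),
    Postmortem("2024-03-11", {"ana", "eli"}, 3),
    Postmortem("2024-04-08", {"ana", "ben", "dee", "eli"}, 3),
    Postmortem("2024-05-06", {"ben", "chidi", "dee", "eli"}, 4),
    Postmortem("2024-06-03", {"ana", "ben", "chidi", "dee", "eli"}, 4),
]
change = trend(log)
print(change)
```

Positive values for both changes are the direction you want. Flat or falling numbers are a prompt to look at facilitation before drawing conclusions about the team.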

4. Microsoft Viva (Continuous Signals Through Workflow Data)

Microsoft's approach embeds psychological safety measurement into its Viva employee experience platform. Rather than relying solely on periodic surveys, Viva surfaces signals from daily workflow data: meeting participation patterns, communication network health, and collaboration habits. In April 2025, Microsoft's Viva People Science team presented research on "Building Psychological Safety Amidst Change" with Dr. Julie Morris, positioning psychological safety as a core dimension of the employee experience.

Viva's "voice of the employee" module combines org-wide survey capabilities with questions and action plans based on people science research. The platform benchmarks results against industry standards and uses automated comment analysis to synthesize qualitative feedback into themes.

Why it works: Continuous measurement catches decay in real time rather than waiting for the next quarterly survey. If a team's meeting participation drops sharply after a reorg, the signal surfaces immediately. The integration with daily tools (Teams, Outlook, SharePoint) means measurement happens passively without requiring additional effort from employees.

The blind spot: Workflow signals are proxies, not direct measures. A team might have low meeting participation because they collaborate well async, not because safety is low. The platform also requires the Microsoft 365 ecosystem, which limits applicability for organizations on other stacks. And passive measurement raises legitimate privacy concerns that must be addressed transparently.

How to steal it: Even without Viva, you can track workflow-based safety signals manually. Monitor the ratio of questions asked vs. statements made in your team's Slack or Teams channels. Track how many different people contribute to design documents and code reviews. Look at whether junior team members comment on senior members' pull requests. These are behavioral proxies for safety that require no survey at all.
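These manual signals are easy to compute from an export of channel messages and review comments. The sketch below is a rough illustration under stated assumptions: a message counts as a question only if it ends with a question mark (a crude heuristic), the record fields `commenter` and `pr_author` are hypothetical names for whatever your export provides, and "junior" is whatever list of people you define.

```python
def question_ratio(messages):
    """Questions per statement; returns None if there are no statements."""
    questions = sum(1 for text in messages if text.strip().endswith("?"))
    statements = len(messages) - questions
    if statements == 0:
        return None
    return questions / statements


def review_signals(comments, juniors):
    """Count distinct reviewers and junior comments on non-junior authors' PRs."""
    commenters = {c["commenter"] for c in comments}
    junior_on_senior = sum(
        1
        for c in comments
        if c["commenter"] in juniors and c["pr_author"] not in juniors
    )
    return {
        "distinct_commenters": len(commenters),
        "junior_on_senior": junior_on_senior,
    }


# Hypothetical channel export and review comments (illustrative data only).
channel = [
    "Why do we retry three times here?",
    "Deploy is done.",
    "Could someone sanity-check my migration?",
    "Merged.",
    "Rolling back the flag.",
]
comments = [
    {"commenter": "eli", "pr_author": "ana"},
    {"commenter": "ben", "pr_author": "eli"},
    {"commenter": "eli", "pr_author": "ben"},
    {"commenter": "ana", "pr_author": "ben"},
]

ratio = question_ratio(channel)
signals = review_signals(comments, juniors={"eli"})
print(ratio, signals)
```

Report these at team level only and tell the team what is being counted, since the privacy concern raised for passive measurement applies to manual tracking too.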

The Missing Layer: Behavioral Data Under Pressure

All four frameworks above share a common gap. Surveys measure perception. Postmortems measure candor during structured reflection. Workflow data measures collaboration patterns during normal operations. None of them measure how a team actually behaves when the pressure is on, when there is a real disagreement to navigate, a conflict to surface, or a hard call to make in real time.

This is the layer that research identifies as most critical. A 2025 study in the Journal of Business and Psychology found that psychological safety requires four active processes: connecting, clarifying, supporting, and performing. The "performing" process, executing together under real conditions, is the one that almost every measurement approach misses.

This is where behavioral simulation comes in. QuestWorks runs teams through scenario-based challenges on its own cinematic, voice-controlled platform. Each quest creates real pressure: time constraints, competing priorities, information asymmetry. The platform captures behavioral data during those moments, including who speaks up, how disagreements get resolved, and whether the team adapts or fractures under stress. QuestDash surfaces the patterns for the whole team, with leaders seeing aggregate trends and strengths-based highlights. HeroGPT provides private coaching. Everything is voluntary and never tied to performance reviews.

At $20/user/month with a 14-day free trial, adding a behavioral measurement layer costs less than a single facilitated workshop. The goal is not to replace surveys or postmortems. It is to fill the gap they leave: what happens when the stakes are real and the pressure is on.

Which Framework Should You Start With?

If you measure nothing today, start with Edmondson's 7-item survey. It takes five minutes per person, costs nothing, and gives you a baseline. Deploy it this quarter.

If you already survey but want behavioral data, add blameless postmortems (or extend your existing retros with Etsy's facilitation guide) and track participation breadth and disclosure depth over time.

If you want continuous measurement, layer in workflow signals (meeting balance, review participation, question-to-statement ratios) using whatever tools your team already lives in.

And if you want to measure how your team performs under actual pressure, without waiting for a real crisis to find out, add a behavioral simulation layer.

The companies that sustain psychological safety measure it from multiple angles. A single annual survey is a snapshot. A multi-method approach is a film. You need the film.

Start a 14-day free trial.

Frequently Asked Questions

What is Edmondson's 7-item psychological safety survey?

Amy Edmondson's Team Psychological Safety scale (TPS-7) is a validated 7-question survey scored on a 7-point Likert scale. Questions cover topics like whether mistakes are held against you, whether it is safe to take risks, and whether teammates value your unique skills. Total scores range from 7 to 49, with higher scores indicating stronger psychological safety.

How did Google measure psychological safety?

Google used a combination of pulse surveys, behavioral observations, and real-time feedback through its internal gTeams survey. The survey asked questions like "Can you take risks on this team without feeling insecure?" across 180 teams. The finding: psychological safety was the single strongest predictor of team effectiveness.

What is a blameless postmortem?

A blameless postmortem, pioneered by Etsy, is an incident review process where the team collectively builds a timeline, extracts lessons, and develops recommendations without assigning blame. It measures psychological safety indirectly: if people share openly, safety is present. If they hedge, deflect, or stay silent, it is not. The quality of disclosure is the metric.

How often should you measure psychological safety?

Annual surveys are too infrequent. Research from the Journal of Business and Psychology (2025) shows psychological safety is perishable and can decay between measurement cycles. Leading companies use quarterly pulse surveys combined with continuous behavioral signals, such as participation patterns in retrospectives and the candor level in postmortems.

Can you measure psychological safety without surveys?

Yes. Behavioral observation (who speaks up in meetings, how mistakes are handled publicly), blameless postmortem quality, and team simulation data all provide non-survey signals. QuestWorks captures behavioral patterns during simulated team challenges on its own platform, surfacing data like communication balance and conflict response that surveys cannot see.
