Applied Intuition’s V&V Handbook: Scenario Creation and Test Execution (Part 2)

July 21, 2022

This blog post is the second in a three-part series highlighting different aspects of Applied’s verification and validation (V&V) handbook. Read part 1 for an introduction to V&V and an overview of the best practices that autonomy programs can follow in different stages of their advanced driver-assistance systems (ADAS) and automated driving systems (ADS) development. Part 2 of our series shows how autonomy programs typically approach scenario creation and test execution depending on their development stage and how they can address common challenges. Keep reading to learn more about this topic, or access the full-length V&V handbook below.

Read Applied Intuition’s V&V handbook

Scenario Creation

In ADAS and ADS development, a scenario is a description of a scene, including each actor and its behavior over a period of time. The PEGASUS method provides a model for systematically describing scenarios based on six independent layers (Figure 1): environment topology, traffic infrastructure, environment state, objects and agents, environmental conditions, and digital information.

Figure 1: PEGASUS layers for scenario modeling; not pictured is the sixth layer for digital information (e.g., vehicle-to-everything, digital data/map information)
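
As a rough illustration of the layered model, the sketch below stores a scenario as one record with a field per layer. The class name, field types, and example values are assumptions made for this post, not a format defined by PEGASUS or by the handbook.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """Hypothetical container with one field per layer named in the text."""
    environment_topology: dict = field(default_factory=dict)
    traffic_infrastructure: dict = field(default_factory=dict)
    environment_state: dict = field(default_factory=dict)
    objects_and_agents: list = field(default_factory=list)
    environmental_conditions: dict = field(default_factory=dict)
    digital_information: dict = field(default_factory=dict)


# Illustrative values only; a real scenario format carries far more detail.
cut_in = Scenario(
    environment_topology={"road_type": "highway", "lanes": 3},
    objects_and_agents=[
        {"id": "ego", "initial_speed_mps": 30.0},
        {"id": "cut_in_vehicle", "initial_speed_mps": 25.0,
         "behavior": "lane_change_left"},
    ],
    environmental_conditions={"time_of_day": "day", "rainfall_mm_per_h": 0.0},
)

print(len(cut_in.objects_and_agents), "actors")
```

Keeping the layers as separate fields makes it straightforward to vary one layer (for example, environmental conditions) while holding the others fixed.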

Scenarios play an essential role in every autonomy program’s V&V efforts. They allow teams to test and evaluate an autonomous system’s performance in specific situations. Scenario-based testing also helps teams systematically build coverage. Coverage measures how much of the operational design domain (ODD) the autonomous system has been tested on so far. The “Defining and measuring coverage” section in our V&V handbook lays out in more detail how autonomy programs in different development stages can define and measure coverage.
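
One simple way to put a number on coverage is to discretize the ODD into attribute combinations and count how many of them have at least one executed test. The sketch below does that. The ODD attributes, their values, and the coverage definition itself are assumptions for illustration; the handbook may define coverage differently.

```python
from itertools import product

# Hypothetical, heavily simplified ODD description.
ODD = {
    "road_type": ["highway", "urban"],
    "time_of_day": ["day", "night"],
    "rainfall": ["none", "light", "heavy"],
}


def coverage(odd, tested):
    """Fraction of ODD attribute combinations with at least one executed test."""
    keys = sorted(odd)
    all_cells = set(product(*(odd[k] for k in keys)))
    tested_cells = {tuple(t[k] for k in keys) for t in tested}
    return len(tested_cells & all_cells) / len(all_cells)


tested_so_far = [
    {"road_type": "highway", "time_of_day": "day", "rainfall": "none"},
    {"road_type": "highway", "time_of_day": "night", "rainfall": "none"},
    {"road_type": "urban", "time_of_day": "day", "rainfall": "light"},
]

# 3 of the 2 * 2 * 3 = 12 combinations have been tested.
print(f"{coverage(ODD, tested_so_far):.0%}")  # 25%
```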

The following table shows how autonomy programs typically approach scenario creation depending on their development stage (Figure 2).

Figure 2: Scenario creation by stages of V&V

As seen in Figure 2, early-stage teams usually focus on building broad coverage across requirements and scenario categories. Once they have built broad coverage, later-stage teams focus on collecting and generating edge case scenarios and expanding into new domains.

Building a comprehensive scenario library

Throughout their V&V efforts, autonomy programs need to build a comprehensive scenario library that covers the entire ODD for the intended deployment. Using this library, teams can test their autonomous system against key performance and safety benchmarks for the scenarios that could occur in the ODD. The “Building a comprehensive scenario library” section in the handbook lays out different approaches and techniques that programs can leverage to build out their scenario library.

Defining evaluation criteria and metrics

Based on their scenario library and system requirements, autonomy programs should define evaluation criteria and metrics that test the system’s performance. These evaluation criteria turn a scenario into a test case. Autonomy programs should track a measurable, overall pass/fail outcome for each test case. This outcome is a composite of key competency, safety, and comfort factors: a test case passes only if all non-optional evaluation rules pass. Teams should also be able to drill into each rule and its underlying metrics. The V&V handbook’s “Defining evaluation criteria and metrics” section lists specific metrics and evaluation criteria that teams should assess for their test cases.
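
The composite outcome described above can be sketched in a few lines. The rule names, categories, and metric values below are hypothetical, and the structure is an assumption for illustration rather than the handbook's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleResult:
    name: str
    category: str            # "competency", "safety", or "comfort"
    passed: bool
    optional: bool = False
    metric: Optional[float] = None  # underlying measurement, if any


def overall_pass(results):
    """A test case passes only if every non-optional rule passes."""
    return all(r.passed for r in results if not r.optional)


def failed_rules(results):
    """Rules to drill into, including optional ones that failed."""
    return [r for r in results if not r.passed]


# Hypothetical results for one cut-in test case.
results = [
    RuleResult("reached_goal", "competency", passed=True),
    RuleResult("min_gap_m >= 2.0", "safety", passed=True, metric=2.4),
    RuleResult("max_jerk_mps3 <= 2.5", "comfort", passed=False,
               optional=True, metric=3.1),
]

print(overall_pass(results))                   # True
print([r.name for r in failed_rules(results)])
```

Here the failed comfort rule is optional, so the test case still passes overall, but the failure and its metric remain visible for analysis.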

Test Execution

The following table lays out which test methods autonomy programs typically use at each stage in their development and what role real-world tests play at each stage (Figure 3). 

Figure 3: Test methods by stages of V&V

Autonomy teams can prevent scaling and cost issues by ramping up simulation usage as soon as possible. It can also be beneficial to transition vehicle tests to focus less on core testing and more on final validation and edge case discovery. The “Test execution” section in our handbook explains how autonomy programs can use each test method effectively depending on the team’s development stage while considering each method’s strengths and weaknesses.

Combating combinatorial explosion in scenario-based testing

One of the main challenges of test execution is the problem of combinatorial explosion. Autonomy programs must bias their resources towards safety-critical scenarios, as those provide the most information to validation, safety, and development teams. However, scenario libraries continually increase in size as the overall testing program matures. The number of scenarios teams need to test usually increases linearly with the number of new requirements. In contrast, the volume of the scenario space and the total number of scenarios teams need to execute increase exponentially with the number of ODD attributes and parameters they need to cover.

For example, an autonomy program might need to test 1.6 million variations to exhaustively cover all possible parameter combinations of a specific test case (Figure 4). This example does not include different environmental conditions (e.g., time of day, rainfall), map locations, and higher granularity of behavioral parameters, all of which would multiply the required number of tests even further. On top of that, these 1.6 million variations only pertain to a single test case, while autonomy programs need to run thousands of test cases for each software release.

Figure 4: Example cut-in scenario; exhaustively testing a handful of values for each test parameter would require 1.6 million variations for a single test case
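
The arithmetic behind this kind of explosion is a plain product of the number of values tested per parameter. The post does not give the parameter breakdown for Figure 4, so the parameter names and counts below are made up and chosen only so that the product comes out to 1.6 million.

```python
from math import prod

# Hypothetical number of values swept per parameter (not taken from Figure 4).
values_per_parameter = {
    "ego_speed": 10,
    "actor_speed": 10,
    "longitudinal_gap": 10,
    "lateral_speed": 10,
    "cut_in_duration": 8,
    "actor_type": 5,
    "lane_count": 4,
}

total = prod(values_per_parameter.values())
print(f"{total:,}")  # 1,600,000

# Adding two assumed environmental attributes multiplies the total again.
with_environment = total * 4 * 5  # 4 times of day, 5 rainfall levels
print(f"{with_environment:,}")  # 32,000,000
```

Each additional parameter multiplies the total, which is why exhaustive sweeps stop being practical after only a handful of attributes.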

Applied recommends a scalable simulation-first testing strategy. Unfortunately, even with simulation, teams might still need to execute hundreds of millions of scenarios in each release. To supplement their scalable simulation strategy, autonomy programs should leverage intelligent sampling techniques to identify the scenarios that are most worth spending testing resources on. Depending on their development stage, teams should optimize for one of the following goals: 1) testing for coverage and gaining new information about the ODD; or 2) finding safety-critical scenarios to drive development forward (Figure 5).

Figure 5: Recommended optimization goals of intelligent scenario-based testing methods by V&V stage
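
To make the two optimization goals concrete, the sketch below first samples the parameter space uniformly (a basic coverage-oriented approach) and then spends a fixed execution budget on the candidates that look most critical. The parameter ranges and the time-to-collision proxy are assumptions chosen for illustration; they are not the sampling techniques described in the handbook.

```python
import random

# Hypothetical parameter ranges for a cut-in style scenario.
PARAMETER_RANGES = {
    "ego_speed_mps": (15.0, 35.0),
    "actor_speed_mps": (10.0, 35.0),
    "initial_gap_m": (5.0, 60.0),
}


def sample_scenarios(n, seed=0):
    """Goal 1: draw n parameter sets uniformly to spread tests over the space."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAMETER_RANGES.items()}
        for _ in range(n)
    ]


def time_to_collision_s(scenario):
    """Crude criticality proxy: gap divided by closing speed (inf if not closing)."""
    closing = scenario["ego_speed_mps"] - scenario["actor_speed_mps"]
    return scenario["initial_gap_m"] / closing if closing > 0 else float("inf")


def most_critical(scenarios, budget):
    """Goal 2: keep only the `budget` candidates with the lowest proxy value."""
    return sorted(scenarios, key=time_to_collision_s)[:budget]


candidates = sample_scenarios(10_000)
to_run = most_critical(candidates, budget=100)
print(len(to_run), "of", len(candidates), "candidates selected for execution")
```

In practice the criticality estimate would come from earlier simulation results or a learned model rather than a closed-form proxy, but the budgeted select-then-run pattern is the same.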

The handbook’s “Combating combinatorial explosion in scenario-based testing” section lists different techniques that autonomy programs can leverage to speed up their testing, development, and information gathering to combat combinatorial explosion.


Autonomy programs in all development stages can leverage best practices for scenario creation and test execution to advance their V&V efforts. Programs should build a comprehensive scenario library, define evaluation criteria and metrics to turn scenarios into test cases, leverage different test methods effectively, and use simulation and intelligent sampling techniques.

Applied Intuition’s V&V handbook discusses these and many other topics in more detail. Download the full-length handbook today, and stay tuned for part 3 of our blog post series, which will explore how autonomy programs can define and measure coverage and analyze their system’s performance.

Download the full V&V handbook

Contact our engineering team if you have questions about this handbook or would like to learn more about Applied’s V&V platform, Basis.