Announcing Synthetic Datasets for ML Applications

August 9, 2022

Applied Intuition is excited to introduce Synthetic Datasets—high-fidelity synthetic data for machine learning (ML). Synthetic Datasets empower users to improve the robustness of their ML models with camera, lidar, radar, and other sensor data. Users can generate millions of labeled samples with diverse actors, behaviors, and environmental conditions.

Synthetic Datasets: Develop ML Models With High-Fidelity Synthetic Data

Obtaining data for rare events is critical to training robust ML models. However, real-world data collection is time-consuming, costly, and limited by how rarely certain situations actually occur. Annotating the collected data is also expensive and error-prone, which adds further delays and degrades model performance.

Synthetic Datasets enable perception engineers, autonomy leads, and sensor vendors to accelerate the ML development loop, train with dense labels, broaden task domains, bootstrap labels, and increase their team’s efficiency (Figure 1).

Figure 1: Synthetic camera data with raytraced reflections and difficult outdoor lighting conditions. 

Synthetic Datasets include:

  • A library of validated sensor models tuned to represent your sensor hardware; support for camera, lidar, radar, and more
  • Error-free ground truth labels generated programmatically; flexible annotation format to enable integration with your existing pipeline
  • Assets and procedurally generated 3D worlds with support for domain randomization or customization to your task domain
  • High-level dataset definition language and visual editor to easily define the data you need
  • Dataset management tooling to view statistics, filter, and export your data
  • Cloud-first infrastructure enabling rapid dataset generation, elastic scalability, and easy collaboration (Figure 2)

Figure 2: Applied’s Synthetic Datasets allow teams to generate datasets from a high-level language, logs, or scenarios.
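
To make the idea of domain randomization mentioned in the list above more concrete, here is a minimal, generic Python sketch. It is not Applied's dataset definition language or API: the `SceneSpec` class, the `sample_scene` and `sample_dataset` functions, and every parameter name and range are hypothetical and chosen only for illustration. The sketch shows the general technique of drawing scene parameters at random from predefined ranges so that a dataset covers diverse actors and environmental conditions.

```python
import random
from dataclasses import dataclass

# Hypothetical parameter spaces, for illustration only.
WEATHER = ["clear", "rain", "fog", "snow"]
ACTOR_TYPES = ["car", "truck", "pedestrian", "cyclist"]


@dataclass(frozen=True)
class SceneSpec:
    """One randomized scene description (hypothetical schema)."""
    weather: str
    sun_elevation_deg: float
    actors: tuple


def sample_scene(rng: random.Random) -> SceneSpec:
    """Draw each scene parameter independently from its range."""
    n_actors = rng.randint(1, 20)  # inclusive on both ends
    return SceneSpec(
        weather=rng.choice(WEATHER),
        sun_elevation_deg=rng.uniform(-10.0, 90.0),
        actors=tuple(rng.choice(ACTOR_TYPES) for _ in range(n_actors)),
    )


def sample_dataset(num_scenes: int, seed: int = 0) -> list:
    """Generate a reproducible list of randomized scene specs."""
    rng = random.Random(seed)  # seeded so the same dataset can be regenerated
    return [sample_scene(rng) for _ in range(num_scenes)]


if __name__ == "__main__":
    for spec in sample_dataset(3, seed=42):
        print(spec.weather, round(spec.sun_elevation_deg, 1), len(spec.actors))
```

Seeding the generator keeps a sampled dataset reproducible, which helps when comparing model versions trained on the same synthetic scenes. A real pipeline would pass each spec to a simulator and sensor models to render data and labels; that step is outside the scope of this sketch.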

The lidar technology company Ouster already works with Applied to accelerate the deployment of their new lidar models. Ouster leverages synthetic data to mitigate overfitting and optimize ML models for new hardware.

“By working with Applied Intuition to provide high-fidelity sensor simulation to our customers, we have greatly simplified the sensor integration process and ultimately accelerated a customer’s time to autonomy,” said Mark Frichtl, CTO at Ouster.

Request a free sample dataset today, or contact our team to learn how Synthetic Datasets can facilitate your team’s ML training efforts and improve model performance.