Advances in synthetic data and self-driving car safety
The adage “fake it until you make it” is proving true in the development of autonomous vehicles (AVs). Historically, vehicles have been physically tested on the road to make sure they are thoroughly validated for the market. And, of course, as these vehicles are tested for safety, well-trained drivers are expected to be behind the wheel at all times, ready to take full control.
As the auto industry transforms and focuses on the development of autonomous vehicles, testing for safety, while no less critical, is only part of the equation. The systems that control AVs first need to be trained. The time, mileage, and cost required to train a self-driving car to perceive the objects around it, predict what those objects might do, and respond accordingly present an enormous challenge.
The system that perceives objects around the car is the vehicle’s most critical asset. The perception system is responsible for collecting the vehicle’s sensor data and identifying objects in the vehicle’s surroundings, all in real time. Not coincidentally, the perception system is also the most difficult system to train. This is because perception systems are powered by artificial intelligence (AI), and even with today’s best algorithms these AI systems cannot reliably recognize things they have not seen thousands of times. Perception systems learn by seeing many examples, and even small variations among those examples affect the systems’ judgment.
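As a toy illustration of why perception systems need many varied examples, the sketch below trains a nearest-centroid classifier on a handful of labeled feature vectors. The two-dimensional features, class names, and numbers are all invented for this example; a real AV perception system uses deep networks over camera, LiDAR, and radar data, but the failure mode is the same: an object variant absent from training gets misjudged.

```python
# Toy nearest-centroid "perception" classifier.
# All features, classes, and values below are hypothetical illustrations.

def centroid(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def classify(sample, centroids):
    """Assign the sample to the class whose centroid is nearest (squared distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sqdist(sample, centroids[label]))

# Hypothetical 2-D features: (height in meters, speed in meters per second)
train = {
    "pedestrian": [[1.7, 1.2], [1.6, 1.4], [1.8, 1.0]],
    "vehicle":    [[1.5, 12.0], [1.4, 15.0], [1.6, 13.0]],
}
centroids = {label: centroid(examples) for label, examples in train.items()}

# A slow-moving vehicle (e.g., creeping in a traffic jam) looks pedestrian-like
# to a model that has only ever seen fast vehicles:
print(classify([1.5, 2.0], centroids))  # -> pedestrian (a misclassification)
```

Adding slow-vehicle examples to the training set shifts the vehicle centroid and fixes this particular mistake, which is exactly the role varied (including synthetic) training examples play at scale.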
So, where do autonomous vehicle developers get the thousands of varied examples needed to be sure that their perception systems can identify an object in time, regardless of where or what it is? Recent advances in computer graphics have made the creation of synthetic training data a reality.
Examples include fully computer-generated camera, LiDAR, and radar imagery. Such imagery is not only used to test how a vehicle would respond in the real world; it is also a viable way to train the complex AI responsible for the autonomous vehicle’s ability to see and understand the world around it. Importantly, the data created to test the sensors can include a near-endless variety of conditions and can be regenerated if a sensor position is changed or a new sensor is added.
To create the countless situations that a vehicle may encounter on the road, including the rare edge cases and dangerous situations that only a small percentage of cars will ever face, a tremendous amount of data is required, far more than could be collected manually. Here, synthetic data are the catalyst, providing the training and safety validation a vehicle needs to handle itself properly in almost any circumstance that could arise.
Developers will surely continue to collect data manually from actual vehicles, but doing so is costly and painstakingly time consuming. Most importantly, the resulting small datasets would cover only the limited situations the physical vehicles encountered during testing, leaving a vehicle insufficiently trained to handle the myriad situations it could, and most likely will, encounter.
Synthetic data can also model visual degradation of road markings, such as faded or partially obscured lane lines, so that perception systems learn to accommodate them.
For self-driving cars, training and testing on real data alone can cause real problems, because it cannot expose the vehicle to every conceivable situation that could occur. By contrast, synthetic datasets are machine-generated and can produce situations that a human driver may not be able to create physically. Beyond their scale and diversity, machine-generated data are accurate and consistent. This reduces the number of iterations necessary for teaching the vehicle’s AI and accelerates the perception system’s improvement.
With that said, physical driving will continue to play a very important role in the training and testing of self-driving cars and ADAS (advanced driver-assistance system) applications; however, that physical driving will run alongside simulation over vast amounts of synthetic data. As long as these vehicles share the road with humans, humans will want to evaluate and test vehicle behavior in the real world, and rightfully so, just as long as the vehicles have proven themselves in the virtual world first.