Driven to safety — it’s time to pool our data

For most Americans, the thought of cars autonomously navigating our streets still feels like a science fiction story. Despite the billions of dollars invested into the industry in recent years, no self-driving car company has proven that its technology is capable of producing mass-market autonomous vehicles in the near future.

In fact, a recent IIHS investigation identified significant flaws in assisted driving technology and concluded that in all likelihood “autonomous vehicle[s] that can go anywhere, anytime” will not be market-ready for “quite some time.” The complexity of the problem has even led Uber to potentially spin off its autonomous car unit as a means of soliciting minority investments. In short, the cost of solving this problem is time and billions (if not trillions) of dollars.

Current shortcomings aside, there is a legitimate need for self-driving technology: every year, nearly 1.3 million people die and 2 million people are injured in car crashes. In the U.S. alone, 40,000 people died last year due to car accidents, putting car accident-based deaths in the top 15 leading causes of death in America. GM has determined that the major cause for 94 percent of those car crashes is human error. Independent studies have verified that technological advances such as ridesharing have reduced automotive accidents by removing from our streets drivers who should not be operating vehicles.

We should have every reason to believe that autonomous driving systems, deterministic and finely tuned computers always operating at peak performance, will all but eliminate on-road fatalities. The challenge of developing self-driving technology is rooted in replicating the incredibly nuanced cognitive decisions we make every time we get behind the wheel.

Anyone with experience in the artificial intelligence space will tell you that the quality and quantity of training data are among the most important inputs in building AI that functions in the real world. This is why today’s large technology companies continue to collect and keep detailed consumer data, despite recent public backlash. From search engines to social media to self-driving cars, data (in some cases even more than the underlying technology itself) is what drives value in today’s technology companies.

It should be no surprise then that autonomous vehicle companies do not publicly share data, even in instances of deadly crashes. When it comes to autonomous vehicles, the public interest (making safe self-driving cars available as soon as possible) is clearly at odds with corporate interests (making as much money as possible on the technology).

We need to create industry and regulatory environments in which autonomous vehicle companies compete based upon the quality of their technology, not just upon their ability to spend hundreds of millions of dollars to collect and silo as much data as possible (yes, this is how much gathering this data costs). In today’s environment the inverse is true: autonomous car manufacturers are focusing on gathering as many miles of data as possible, with the intention of feeding more information into their models than their competitors, all the while avoiding working together.

The siloed petabytes (and soon exabytes) of road data that these companies hoard should be, without giving away trade secrets or information about their models, pooled into a nonprofit consortium, perhaps even a government entity, where every mile driven is shared and audited for quality. By all means, take this data to your private company and consume it, make your models smarter and then provide more road data to the pool to make everyone smarter — and more importantly, increase the pace at which we have truly autonomous vehicles on the road, and their safety once they’re there.

This data is complex and diverse, yet public: I am not suggesting that people hand over private, privileged data, but that companies actively pool and combine what the cars are seeing. There’s a reason that many of the autonomous car companies are driving millions of virtual miles: they’re attempting to get as much active driving data as they can. Beyond the fact that they drove those miles, what truly makes that data something that they have to hoard? By sharing these miles, by seeing as much of the world in as much detail as possible, these companies can focus on making smarter, better autonomous vehicles and bring them to market faster.

If you’re reading this and thinking it’s deeply unfair, I encourage you to once again consider that 40,000 people are dying preventably every year in America alone. If you are not compelled by the massive life-saving potential of the technology, consider that publicly licensable self-driving data sets would accelerate innovation by removing a substantial portion of the capital barrier to entry in the space and increasing competition.

Though big technology and automotive companies may scoff at the idea of sharing their data, the competition generated from a level data playing field could create tens of thousands of new high-tech jobs. Any government dollar spent on aggregating road data would be considered capitalized as opposed to lost — public data sets can be reused by researchers for AI and cross-disciplinary projects for many years to come.

The most ethical (and most economically sensible) choice is that all data generated by autonomous vehicle companies should be part of a contiguous system built to make for a smarter, safer humanity. We can’t afford to wait any longer.


Source: Tech Crunch
