As the development of autonomous cars continues, the challenges around how data from those vehicles is managed need to be addressed, according to Dell Technologies’ Florian Baumann.
There’s a lot of buzz around the development of autonomous cars, from discussions about the software that goes into them to the time it will take to have fully autonomous vehicles on the road.
However, the data those vehicles depend on is discussed far less often. The sheer amount of storage they require raises questions about how that data will be safely managed, held and transferred when self-driving cars start appearing on our roads.
Florian Baumann is the global CTO for automotive and AI at Dell Technologies. He spoke to Siliconrepublic.com about the data challenges the autonomous vehicle industry is facing, starting with the growing level of data storage required.
Vehicles are graded by level of automation, from level zero (no automation) to level five (full automation). The more autonomous the vehicle, the more data storage it will require. So, for example, a level-two car is one that is operated by a human at all times but has additional automation systems such as lane-change assist, blind-spot detection or self-parking features. According to Baumann, a level-two car requires between four and 10 petabytes of data storage.
However, a level-three car requires between 50 and 100 petabytes of data storage, and a level-five car requires three or more exabytes. “A level-three system means the car can drive autonomously [with conditions]. If we look at level five, the car can drive fully autonomously,” he said.
What happens to all the data?
While the amount of data storage required for autonomous vehicles is huge, the main challenges arise not at the storage stage but at the transfer stage. For example, when vehicles are sent out into the field to record data from cameras, laser scanners and radars, Baumann said each of those cars can produce up to 80 terabytes of data per day.
“Then you have to move the data physically from the vehicle into your data centre and you can do this, for example, through connecting a cable to the vehicle and then you have to copy the data from your R&D centre to the centralised data lake,” he said.
“Typically, our customers have one centralised data lake per continent and this could be done virtually through accelerated file transfer methods or physically.”
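To put the figures quoted so far in proportion, the short Python sketch below works out how many single-car recording days it would take to generate each storage tier at the peak rate of 80 terabytes per day. It is illustrative arithmetic only: it assumes decimal units (1 petabyte = 1,000 terabytes), assumes every day is recorded at the upper bound, and the function and variable names are made up for this example.

```python
# Illustrative arithmetic only. Decimal units are assumed:
# 1 PB = 1,000 TB and 1 EB = 1,000,000 TB.
TB_PER_CAR_DAY = 80  # upper bound per test car per day, as quoted by Baumann

# Storage figures quoted in the article, converted to terabytes.
STORAGE_TB = {
    "level 2 (low end)": 4_000,
    "level 2 (high end)": 10_000,
    "level 3 (low end)": 50_000,
    "level 3 (high end)": 100_000,
    "level 5 (minimum)": 3_000_000,
}


def car_days_to_fill(storage_tb: float, tb_per_car_day: float = TB_PER_CAR_DAY) -> float:
    """Return the number of single-car recording days that produce storage_tb."""
    return storage_tb / tb_per_car_day


for label, storage_tb in STORAGE_TB.items():
    print(f"{label}: {car_days_to_fill(storage_tb):,.0f} car-days")
```

On these assumptions, the low end of the level-two range corresponds to 50 car-days of recording, while the level-five minimum corresponds to 37,500.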
According to Baumann, two major challenges within the automotive industry in relation to data are the security of the data, especially during transfer, and the need for 5G.
5G infrastructure and data
Baumann said 5G will be crucial for the development and production of self-driving cars, especially in the next five to 10 years when there will be much more technology integrated into the vehicle.
“A vehicle today is a data centre on the road,” he said. “So, you have to pre-process the data in the vehicle, for example, to identify valuable data that is worth sending over 5G to the data centre.”
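One way to picture that in-vehicle pre-processing is a filter that ranks recorded segments and keeps only the most valuable ones that fit an upload budget. The Python sketch below is a hypothetical illustration, not a description of any Dell or carmaker system: the `Segment` type, the `interest` score (imagined as coming from an on-board model) and the threshold and budget values are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A recorded chunk of sensor data (hypothetical structure)."""
    segment_id: str
    size_gb: float
    interest: float  # assumed 0..1 score from an on-board model


def select_for_upload(segments, min_interest=0.8, budget_gb=50.0):
    """Pick the highest-scoring segments that fit within the upload budget."""
    chosen = []
    used_gb = 0.0
    # Consider the most interesting segments first.
    for seg in sorted(segments, key=lambda s: s.interest, reverse=True):
        if seg.interest < min_interest:
            break  # everything after this scores lower still
        if used_gb + seg.size_gb <= budget_gb:
            chosen.append(seg)
            used_gb += seg.size_gb
    return chosen


recorded = [
    Segment("a", 30.0, 0.95),
    Segment("b", 30.0, 0.90),
    Segment("c", 10.0, 0.85),
    Segment("d", 5.0, 0.20),
]
to_send = select_for_upload(recorded)
print([seg.segment_id for seg in to_send])
```

Here segment "b" is skipped because it would exceed the 50 GB budget and "d" falls below the threshold, so only "a" and "c" are selected.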
He also said self-driving cars will need to store data on the edge, meaning inside the vehicle. “You need computing storage on the edge, especially if you don’t have 5G coverage everywhere where you’re operating your car. This is another issue because you have to cache the data in the vehicle to send it over after you have full 5G coverage.”
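The caching Baumann describes resembles a store-and-forward buffer: data is held on board and only sent once a connection is available. The Python sketch below is a minimal illustration under stated assumptions; the class name, the capacity check and the `has_coverage` and `send` callbacks are hypothetical stand-ins for whatever a real vehicle platform would provide.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Holds recorded segments on board until they can be uploaded (sketch)."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.used_gb = 0.0
        self._queue = deque()  # oldest segment on the left

    def record(self, name, size_gb):
        """Cache a segment locally; return False if on-board storage is full."""
        if self.used_gb + size_gb > self.capacity_gb:
            return False
        self._queue.append((name, size_gb))
        self.used_gb += size_gb
        return True

    def flush(self, has_coverage, send):
        """Send cached segments oldest-first for as long as coverage lasts."""
        sent = []
        while self._queue and has_coverage():
            name, size_gb = self._queue.popleft()
            send(name)
            self.used_gb -= size_gb
            sent.append(name)
        return sent


uploaded = []
buffer = StoreAndForwardBuffer(capacity_gb=100)
buffer.record("s1", 40)
buffer.record("s2", 40)
rejected = not buffer.record("s3", 40)  # would exceed the 100 GB capacity

no_signal = buffer.flush(lambda: False, uploaded.append)  # nothing is sent
with_signal = buffer.flush(lambda: True, uploaded.append)  # queue drains
```

A real system would also have to decide what to do when the buffer fills up, for example by dropping the least valuable segments, which this sketch leaves out.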
Another issue with 5G is the upload speed. “5G was made for streaming data for big download speeds,” Baumann said. “The upload speed is not that high. So, you cannot really upload huge amounts of data through 5G. It was made for streaming data from the data centre to the end user and not from a vehicle to a data centre. But 5G has advantages because of its low latency.”
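A rough calculation shows why upload speed matters at this scale. The Python sketch below estimates how long one car's peak daily output of 80 terabytes would take to move over a link of a given sustained rate. The two rates used are placeholders chosen for illustration, not measured 5G or cable figures, and protocol overhead is ignored.

```python
def transfer_hours(data_tb: float, link_mbps: float) -> float:
    """Hours needed to move data_tb terabytes over a link of link_mbps Mbit/s.

    Assumes decimal units (1 TB = 10**12 bytes) and a perfectly sustained rate.
    """
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6)
    return seconds / 3600


DAILY_OUTPUT_TB = 80  # peak per-car figure quoted by Baumann

# Both link rates below are assumptions for illustration only.
for label, mbps in [("100 Mbit/s uplink", 100), ("10 Gbit/s wired link", 10_000)]:
    hours = transfer_hours(DAILY_OUTPUT_TB, mbps)
    print(f"{label}: {hours:,.1f} hours ({hours / 24:,.1f} days)")
```

At the assumed 100 Mbit/s, a single day of recording would take more than 70 days to upload, while the assumed 10 Gbit/s wired link would need just under 18 hours.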
The challenges with centralised data
As with all conversations around data, security is a vital consideration. Baumann said that in autonomous driving, many start-ups and open-source techniques are being used to add security layers, such as encrypting the data or using VPN connections.
However, he added that it is very important for both big automotive companies and start-ups that data scientists can access the data regardless of where they are located.
“We have a development team working in the Bay Area in the US, another development team is working out of the Michigan area on the east coast, then we have a development team in Germany, another one in Korea, in China, and all these guys must access the data and then the data is located, for example, in a data centre in Amsterdam. So how can you access and locate data remotely?”
He said the real challenge is figuring out how to create a global system that enables data scientists to locate and use data, while also being compliant with GDPR. “It goes hand in hand with geo-distributed machine learning and federated learning,” he said.
“The goal is the data stays where it was born and developers can start a global machine learning job without copying the data to a centralised data centre.”
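The idea of training without moving the data can be sketched with a toy version of federated averaging, a commonly used federated learning scheme. In the Python example below, each site fits a one-parameter model on its own data and shares only the resulting weight and its sample count. This is a simplified illustration, not the method Baumann's customers use: the site names, the data and the learning settings are all invented for the example.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps on one site's private (x, y) pairs."""
    for _ in range(epochs):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, sites):
    """One round: sites train locally; only weights and counts are shared."""
    updates = [(local_update(w_global, data), len(data)) for data in sites.values()]
    total = sum(n for _, n in updates)
    # Average the local weights, weighted by each site's sample count.
    return sum(w * n for w, n in updates) / total


# Invented data following y = 3x; the raw pairs never leave their site.
sites = {
    "amsterdam": [(1.0, 3.0), (2.0, 6.0)],
    "michigan": [(3.0, 9.0)],
    "seoul": [(1.5, 4.5), (2.5, 7.5), (0.5, 1.5)],
}

w = 0.0
for _ in range(50):
    w = federated_round(w, sites)
print(f"global weight after 50 rounds: {w:.4f}")
```

The coordinating server only ever sees one number and a sample count per site, which is the property that makes this family of techniques attractive when data has to stay in the region where it was collected.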
Covid-19 and the autonomous driving industry
Baumann also spoke about how the autonomous driving industry has been affected by the ongoing Covid-19 crisis. He said that while some strategic projects are being frozen, the industry as a whole is not really slowing down.
“Customers are still investing into storage networks, computing, because these projects will come. And if they stop the development now, they will lose a huge piece from the cake,” he said.
“In the long term, [Covid-19] will be a trigger to digitalise all these factories more and more so companies are trying to avoid humans working in the productions and they are trying to automate everything and digitalise everything.
“[Coronavirus] will not be the last crisis and if production can fully work autonomously, you don’t have any gaps.”