2020 has quickly become the largest work from home experiment in history. And with more people at home, creative solutions that help people stay connected are even more imperative.
Sixty-two percent of employed Americans say they have worked from home during the crisis, and over half say they prefer to continue working remotely as much as possible, once public health restrictions are lifted. With this information top of mind, key players in the augmented and virtual reality space are working to reimagine virtual worlds into virtual office spaces.
Technology roadmaps can be unpredictable for augmented and virtual reality applications, and building virtual experiences for remote workers and shoppers comes with its share of challenges.
For example, startups like Spatial are building AR/VR platforms for office collaboration. Its platform allows avatars to join meetings and manipulate digital objects in real time within a virtual world. Big tech companies are also reportedly building headsets in anticipation of the shift to virtual offices. However, communication and collaboration can be strained in a virtual office, given the lack of physical cues from body language and facial expressions.
Another consideration is that the perceived risk of making a purchase sight unseen increases. Retailers like Nike already use AR to bridge the gap between in-person and online shopping experiences, but the opportunity for impulse buys or split-second moments of inspiration may be greatly reduced without the sensory experience of a physical store. Still, AR and VR technology present a promising opportunity for humans and machines to work together to reimagine the world around us.
Here are some common challenges faced with AR and VR application development:
How do you effectively analyze human body language, gestures, facial expressions, etc.?
Can you accurately classify objects around you in real time?
How do you accurately determine an object's location, how much space it occupies, and the user's position relative to that object, all in real time?
How do you recreate a realistic 3D environment in AR or VR from a 2D camera (i.e., space rendering)?
Can you eliminate the hand controllers used to measure hand gestures, for a camera-input-only experience?
How do you render egocentric movements such as poses and gestures from the perspective of the other user(s)?
When it comes to overcoming AR and VR challenges, quality AI training data matters. Semantic segmentation accuracy of 98 percent is needed just to remove the background in an AR application. And without a precise understanding of motion or accurate perception of the environment, the realism of AR and VR applications is lost and the user’s experience is greatly impaired. For example, before you can eliminate hand controllers, you first need to understand what the user’s hands and fingers are trying to do (point at something, grab something, wave at someone, etc.) and collect data relevant to that use case.
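To make the 98 percent figure concrete, here is a minimal sketch of how per-pixel segmentation accuracy is measured against a ground-truth mask, and how that mask then removes the background. The 4x4 masks and pixel values are hypothetical, and real pipelines operate on image arrays rather than nested lists.

```python
# Hypothetical 4x4 masks: 1 = person (foreground), 0 = background.
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],  # one mislabeled pixel (false positive)
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

def pixel_accuracy(gt, pred):
    """Fraction of pixels whose predicted label matches the ground truth."""
    total = sum(len(row) for row in gt)
    correct = sum(g == p
                  for gt_row, pred_row in zip(gt, pred)
                  for g, p in zip(gt_row, pred_row))
    return correct / total

def remove_background(image, mask):
    """Zero out every pixel the mask marks as background."""
    return [[px if m == 1 else 0 for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]

print(pixel_accuracy(ground_truth, predicted))  # 0.9375, below a 0.98 bar
```

A single wrong pixel in a 16-pixel mask already drops accuracy to 93.75 percent, which illustrates how demanding a 98 percent bar is on annotation quality.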
Everything from localization and mapping (how computers visualize the world) to semantics (how computers understand the world as we do) must be addressed for production-level AR and VR. This is where the quality of your training data makes a difference.
Carefully select your data, removing bias and accounting for edge cases in your dataset. Considerations include light sensitivity (people should be able to play in the dark as well as in bright sunlight) and diversity of age, gender, race, etc., to ensure the AR/VR application can detect people of all backgrounds.
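A simple way to start that audit is to count examples per attribute value and flag underrepresented groups. The sketch below assumes hypothetical metadata fields (`lighting`, `age_group`) that are illustrative, not from any real schema.

```python
from collections import Counter

# Hypothetical metadata records for a person-detection dataset.
samples = [
    {"lighting": "bright", "age_group": "adult"},
    {"lighting": "bright", "age_group": "adult"},
    {"lighting": "dark",   "age_group": "child"},
]

def audit_coverage(records, attribute, minimum):
    """Count examples per attribute value and return the underrepresented
    ones, surfacing edge cases (e.g. dark scenes) before training starts."""
    counts = Counter(r[attribute] for r in records)
    return {value: n for value, n in counts.items() if n < minimum}

print(audit_coverage(samples, "lighting", minimum=2))  # {'dark': 1}
```

Any attribute the check flags is a candidate for targeted data collection rather than simply gathering more of the same.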
Use sensor fusion to capture the environment accurately, acknowledging that different sensors give you different types of information. For example, a context-reactive game such as Pokemon Go must reconcile conflicting signals: when Pikachu stands next to a fire, lidar says the space is clear while the image says it isn't. The data inputs required to render an indoor space are another example: lidar gives you information about position, while images give you more information about the texture of your objects, much as in autonomous vehicle use cases.
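The fire-next-to-Pikachu case above can be sketched as a veto rule: a position counts as traversable only when both sensors agree. The grid positions and hazard set here are hypothetical, standing in for real lidar occupancy and image classifier outputs.

```python
def is_safe(position, lidar_free, camera_hazards):
    """A position is traversable only when lidar reports free space AND the
    camera reports no hazard there. Either sensor can veto: lidar may see an
    open volume next to a fire that only the image can recognize."""
    return lidar_free.get(position, False) and position not in camera_hazards

# Hypothetical fused observations on a 2D grid.
lidar_free = {(3, 4): True, (3, 5): True}   # lidar: both cells are open space
camera_hazards = {(3, 5)}                    # image classifier: fire at (3, 5)

print(is_safe((3, 4), lidar_free, camera_hazards))  # True
print(is_safe((3, 5), lidar_free, camera_hazards))  # False: camera vetoes
```

Real fusion stacks weigh calibrated confidence scores rather than hard vetoes, but the principle of combining geometric and semantic channels is the same.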
Decide on the format of your fisheye lens data upfront, i.e., convert your data to be flat or keep it distorted. Many open-source models don't work well on fisheye imagery, since distortion becomes severe for objects close to the lens. Depending on the speed and computing power of your device, you can either flatten the data and analyze it in real time (phones generally lack the power for this) or keep it in the fisheye format. Your custom labeled training dataset should look like what the algorithm is going to see, and the format of the training data will greatly impact the complexity of your data labeling requirements.
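To show why flattening is costly, here is the radial mapping under the idealized equidistant fisheye model, where a point at distorted radius r_d = f·θ must be pushed out to the rectilinear radius r_u = f·tan(θ). This is a sketch with a hypothetical focal length; real lenses need per-camera calibration parameters.

```python
import math

def undistort_radius(r_d, f):
    """Map a fisheye radial distance (equidistant model: r_d = f * theta)
    to the rectilinear radius (r_u = f * tan(theta)) of a flattened image.
    Assumes the idealized equidistant model, not a calibrated lens."""
    theta = r_d / f          # angle from the optical axis, in radians
    return f * math.tan(theta)

f = 300.0  # hypothetical focal length in pixels
for r_d in (50.0, 150.0, 250.0):
    print(r_d, round(undistort_radius(r_d, f), 1))
```

The stretch grows rapidly toward the image edge, which is why flattening a full frame per video frame is expensive and why labels drawn on flattened data don't transfer directly to distorted data.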
Choose your taxonomy wisely. Create relevant labels based on your use case, keeping in mind that more label types mean more time spent on labeling and more potential errors. Also make sure data labeling instructions are clear enough to cover general classes of objects, for instance “animals” vs “cats” or “pigs”, as well as objects that could fall within two classes. For example, a rickshaw could be classified as a bike, a motorbike or something else. This is especially important in AR development, since you might encounter unexpected objects in the real world. Clearly defined labeling instructions help ensure an agile response to unexpected categories of objects.
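One way to make such instructions machine-checkable is to encode the taxonomy as an explicit fine-to-coarse mapping, with ambiguous objects (the rickshaw case) pinned to a single answer and everything unexpected routed to review. The class names below are hypothetical examples.

```python
# Hypothetical taxonomy: fine-grained label -> coarse class the model uses.
TAXONOMY = {
    "cat": "animal",
    "pig": "animal",
    "bike": "vehicle",
    "motorbike": "vehicle",
}

# Objects that straddle classes; instructions should pin down one answer.
AMBIGUOUS = {
    "rickshaw": "vehicle",
}

def resolve_label(raw_label):
    """Return the coarse class for a raw annotation, falling back to
    'unknown' so unexpected real-world objects are flagged for review
    instead of being silently mislabeled."""
    if raw_label in AMBIGUOUS:
        return AMBIGUOUS[raw_label]
    return TAXONOMY.get(raw_label, "unknown")

print(resolve_label("rickshaw"))  # vehicle
print(resolve_label("drone"))     # unknown -> send to human review
```

Keeping the ambiguous cases in their own table also gives you a running record of every judgment call, which is useful when you revise instructions later.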
Iterate on your instructions to remove subjectivity. Determining whether an image is high, medium or low quality will depend on which technologies you have access to. Nuances such as hand gestures that carry different meanings in different countries are another example of subjectivity. Your quality rubric and training requirements should cover as many examples as possible, leaving no room for your labeling team to misinterpret the instructions. A trusted training data partner can help identify edge cases and recommend annotation best practices to improve your initial training guidelines. Digging deep to understand why a result is high quality is a step toward calibrating instructions that are specific, measurable and objective.
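A quick signal that instructions are still subjective is low agreement between annotators labeling the same items. Here is a minimal percent-agreement sketch on hypothetical gesture labels; production teams typically use chance-corrected measures such as Cohen's kappa, but the iteration loop is the same.

```python
def percent_agreement(annotator_a, annotator_b):
    """Share of items two annotators labeled identically. Persistently low
    agreement on a class suggests its instructions are still subjective."""
    assert len(annotator_a) == len(annotator_b)
    matches = sum(a == b for a, b in zip(annotator_a, annotator_b))
    return matches / len(annotator_a)

# Hypothetical gesture labels from two annotators on the same 5 clips.
a = ["wave", "point", "grab", "wave", "point"]
b = ["wave", "point", "grab", "point", "point"]

print(percent_agreement(a, b))  # 0.8
```

When agreement dips on a class, the fix is usually a worked example added to the rubric, not retraining the annotators.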
Define what quality means for your models, based on your goals. There’s no one-size-fits-all approach, so it’s important to set quality thresholds based on what you’re trying to achieve. More data isn’t always the answer to reaching your training goals; selecting relevant data and filtering out data that’s irrelevant to your use case is key. Adopting a mindset of experimentation can help you overcome augmented and virtual reality development challenges with speed and efficiency.
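A quality threshold can be made operational by scoring each annotation against a gold-standard label and keeping only those above the bar. This sketch uses intersection-over-union on bounding boxes with an illustrative 0.7 threshold; the right metric and cutoff depend on your use case.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def filter_by_quality(labels, gold, threshold):
    """Keep only annotations that overlap the gold box at or above the
    project's quality threshold; the threshold itself is a project choice."""
    return [box for box in labels if iou(box, gold) >= threshold]

gold = (10, 10, 50, 50)
candidates = [(10, 10, 50, 50), (12, 12, 52, 52), (40, 40, 90, 90)]
print(filter_by_quality(candidates, gold, threshold=0.7))
# exact and near matches pass; the badly offset box is filtered out
```

Sweeping the threshold on a held-out gold set is a cheap experiment that shows exactly how strict a bar your use case can afford.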
Access our checklist on how to ensure quality training data for ML models.
The oasis, the metaverse, the mirror world, the AR cloud: whichever name you choose for the virtual and augmented realities we’ve come to know, they have a bright future ahead.
“The AR Cloud is going to become the single most important software infrastructure in the history of computing, far bigger than Facebook’s Social Graph or Google’s Search Index.”
- Ori Inbar, Super Ventures
Beyond games and entertainment, VR is being used for remote work, sports training, education, etc. VR simulations have even proven helpful in the fight against the novel coronavirus, and 30 percent of consumers surveyed would never go to a retail store again if AR allowed them to buy the right size clothing with confidence.
It’s clear that the multi-billion-dollar virtual goods industry has great potential, but it will need to overcome challenges like insufficient internet bandwidth and high hardware costs, as well as garner greater public interest, before the tech can be fully realized.
The promise of 5G means faster network speeds, more reliable connections and greater wireless connectivity to power the future of augmented and virtual reality. With its emergence, we can expect cloud computing so seamless that AR and VR experiences begin to blur the lines between real-life and the digital world. However, the disconnect between platforms and users in terms of how complex these virtual experiences can and should be may be a challenge that even 5G cannot solve.
Key players like Sony, Oculus, Google and Microsoft have the advantage of experience, end-to-end infrastructure and market share to support AR and VR application development, while emerging and established startups search for creative solutions to back-end problems such as localization and mapping, as well as user adoption. Augmented and virtual reality applications that solve a critical use case, leverage strategic partnerships and show inarguable potential for profit and growth have the strongest chance to thrive in the AR/VR market.
Learn how Walmart Labs works with our team to make sure its machine learning model is set up for success.
Maintain the quality and accuracy of your AI training data with data labeling solutions and training data strategy from Sama.