This Avatar patent offers a glimpse of how Apple's ecosystem and AR/VR headsets could be integrated in the future

Apple's research and development of AR/VR headsets is almost an open secret. From the series of patents the company has filed in recent years, we can see a wide range of AR/VR-related technologies covering UI, UX, hardware, optics, data transmission, sensors, and more. For example, the USPTO recently published an Apple patent related to 3D Avatars, which describes a low-cost 3D image-scanning solution: the cameras of XR headsets, phones, tablets, and computers are used to scan the head, hands, torso, and other body parts, and the resulting virtual avatar is controlled by the user and can move through 3D scenes such as AR/VR.


This patent is quite interesting: it envisions the interface and workflow for pairing a mobile device with an AR/VR headset to generate a 3D Avatar. If Apple does release an AR/VR headset in the future, the solution described in this patent is practical enough that it could apparently be built with existing technology. Of course, when large companies file patents they do not necessarily productize the technologies involved; filings may also serve to block competitors or to build up a technology portfolio. Even so, the practical application of some of the patent's contents cannot be ruled out, and by analyzing patents we can get a sense of Apple's exploration of related technologies and its potential future directions.

About Avatar generation

The patent is titled "Interface for expressing an Avatar in a 3D environment," and it mainly describes how to generate a 3D Avatar and how to use that Avatar to interact with XR scenes. The patent covers many details of a full-body Avatar, such as scanning the user's face with a mobile device's camera (similar to the Face ID function) and tracking the user's hands, feet, torso, arms, shoulders, and other body parts with the XR headset's multi-camera system.

The specific usage process is as follows:

1) Face scanning interface: The user removes their glasses, keeps their head still, and moves the mobile device's camera around the head. The feature does not appear to require full 360° head modeling, so the user can scan the sides and front of the face by hand without anyone's assistance (a minimal code sketch of this capture step follows the list below).

Interestingly, the head-scanning mechanism described in the patent is somewhat similar to fingerprint enrollment on a phone: multiple captures are required to complete the head model. Various expressions also need to be scanned, such as smiling and opening the mouth.

2) After the facial scan is complete, Avatar parameters can be set, such as height, eyeglass frames, and other accessories.

3) After that, the user puts on the AR/VR headset to scan their hands.
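The patent describes this capture flow but not an implementation. As a rough illustration of what the face-scan step could look like with existing iOS APIs, here is a minimal Swift sketch using ARKit's TrueDepth face tracking; the class name and snapshot logic are assumptions for illustration, not something the patent discloses.

```swift
import ARKit

// Minimal sketch (hypothetical, not from the patent): collect face-mesh snapshots
// with ARKit's TrueDepth face tracking while the user turns their head and makes
// the prompted expressions (smile, open mouth).
final class FaceScanCapture: NSObject, ARSessionDelegate {
    private let session = ARSession()
    private(set) var meshSnapshots: [ARFaceGeometry] = []

    func start() {
        guard ARFaceTrackingConfiguration.isSupported else { return } // needs a TrueDepth camera
        session.delegate = self
        session.run(ARFaceTrackingConfiguration())
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let face as ARFaceAnchor in anchors where face.isTracked {
            meshSnapshots.append(face.geometry) // a real app would keep one snapshot per head pose
            let smiling = (face.blendShapes[.mouthSmileLeft]?.floatValue ?? 0) > 0.5
            let mouthOpen = (face.blendShapes[.jawOpen]?.floatValue ?? 0) > 0.5
            print("smiling: \(smiling), mouth open: \(mouthOpen)") // expression prompts could be checked like this
        }
    }
}
```

In a real pipeline, the collected meshes and blend-shape readings would feed whatever avatar-reconstruction model Apple actually uses, which the patent does not specify.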

Meta Reality Labs is also exploring a lightweight 3D facial capture solution based on mobile devices. Judging from previously disclosed Codec Avatar research, Meta can already use the iPhone 12's front-facing camera for high-fidelity 3D facial capture and reconstruction, and can synthesize new 3D viewpoints and expressions with good results. That solution relies on the iPhone 12's Face ID (TrueDepth) camera module, one of the most advanced mobile 3D face-scanning systems on the market, which is more than enough to assist an AR/VR headset with face tracking and capture. Scanning the face with the LiDAR sensor built into some iPhones would be even more accurate.
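As an illustration of the LiDAR point, ARKit on LiDAR-equipped devices can attach a metric depth map to every frame, which a scanning pipeline could fuse with the color images; a minimal configuration sketch (rear camera, hypothetical usage) looks like this:

```swift
import ARKit

// Hypothetical sketch: on LiDAR-equipped iPhones, ARKit can add a metric depth map
// to every frame, which a scanning pipeline could fuse with camera images for accuracy.
let config = ARWorldTrackingConfiguration()
if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
    config.frameSemantics.insert(.sceneDepth) // requires the LiDAR sensor
}
// In ARSessionDelegate's session(_:didUpdate frame:), frame.sceneDepth?.depthMap
// is a CVPixelBuffer of per-pixel depth in meters.
```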

Judging from earlier predictions, Apple's AR/VR headset will be equipped with multiple sets of 3D sensors for tracking eye movement and changes in facial expression, which can drive animated virtual characters such as Animoji. Combined with iPhone face scanning, Apple AR/VR users would be able to generate their own 3D likeness naturally and use it in AR/VR; Apple's product ecosystem could readily support such a feature.

As for gesture recognition, Quest can already track hand joints with its external cameras and computer vision algorithms, and future Apple AR/VR headsets should likewise have a built-in gesture recognition module.
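The patent does not describe a tracking algorithm, but camera-based hand tracking of this kind is already exposed on Apple platforms through the Vision framework. A minimal sketch (the helper function is illustrative, not from the patent):

```swift
import Vision
import CoreVideo

// Hypothetical sketch: camera-based hand tracking with Apple's Vision framework,
// returning normalized image-space positions for each detected hand joint.
func detectHandJoints(in pixelBuffer: CVPixelBuffer) throws
    -> [VNHumanHandPoseObservation.JointName: VNRecognizedPoint] {
    let request = VNDetectHumanHandPoseRequest()
    request.maximumHandCount = 2
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up)
    try handler.perform([request])
    guard let hand = request.results?.first else { return [:] }
    return try hand.recognizedPoints(.all) // wrist, knuckles, fingertips, etc.
}
```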

About interaction

After completing the 3D facial scan, the user takes on the role of the Avatar, perceiving and interacting with the environment in the XR scene through sight, touch, hearing, taste, and smell. In AR mode, the system can generate virtual objects (such as trees or buildings) and blend them with the physical environment, rendering light and shadow on the virtual objects to match the ambient light. The system can even replicate objects from the physical environment through its sensors, giving the replicated virtual objects similar shapes or colors.
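The patent describes this behavior rather than an implementation; on today's ARKit, matching a virtual object's lighting to the room is typically done with the per-frame light estimate, roughly as in the sketch below (the function and the scene light are illustrative):

```swift
import ARKit
import SceneKit

// Hypothetical sketch: match a virtual object's lighting to the real room.
// ARKit publishes a per-frame ambient light estimate that can drive a scene light.
func updateLighting(from frame: ARFrame, with light: SCNLight) {
    guard let estimate = frame.lightEstimate else { return }
    light.intensity = estimate.ambientIntensity          // ~1000 is neutral indoor lighting
    light.temperature = estimate.ambientColorTemperature // in Kelvin
}
```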

Apple announced a photogrammetry-based 3D scanning tool, Object Capture, at WWDC last year. It can scan a real object into a 3D model, effectively creating a high-fidelity digital "copy" of the physical item. The scanned model is saved in the USDZ file format, which can be embedded directly into a web page for preview, viewed in AR, and shared with others via iMessage. Although the tool has not yet reached ordinary consumers, the demo video shown earlier looks impressive, and the 3D models appear quite close to the real thing.
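For developers, this tool is exposed as the PhotogrammetrySession API in RealityKit on macOS. A minimal sketch of the photos-to-USDZ pipeline (the paths are placeholders):

```swift
import Foundation
import RealityKit

// Minimal sketch of the Object Capture photogrammetry pipeline (paths are placeholders).
// A folder of photos of a real object is processed into a USDZ model.
func buildModel() throws {
    let photosFolder = URL(fileURLWithPath: "/path/to/photos", isDirectory: true)
    let outputModel = URL(fileURLWithPath: "/path/to/model.usdz")

    let session = try PhotogrammetrySession(input: photosFolder)

    // Listen for results before starting processing so no events are missed.
    Task {
        for try await output in session.outputs {
            if case .processingComplete = output {
                print("USDZ written to \(outputModel.path)")
            }
        }
    }

    try session.process(requests: [.modelFile(url: outputModel, detail: .reduced)])
}
```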

In addition to using body motion to interact with the XR environment, users can also interact through sight, hearing, touch, taste, and smell, or issue voice commands to adjust the properties of virtual objects.

In some XR scenarios, users hear and interact with audio only. For example, the XR system can recognize the user's head rotation and adjust spatial audio and visual effects in real time to reproduce how sound and light behave in the real space. The patent also states that audio in XR can support a "transparency mode" that selectively blends ambient sound with computer-generated audio. Apple's AirPods Pro/Max already offer a transparency mode that lets ambient sound through while you listen to music; applied to an XR device, it would let users stay connected to the people and environment around them.
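The patent does not say how this would be built, but existing Apple APIs can already drive a spatial-audio listener from headphone head tracking, which gives a feel for the "head rotation adjusts the sound field" behavior. A rough sketch (illustrative, not from the patent):

```swift
import AVFoundation
import CoreMotion

// Hypothetical sketch: drive a spatial-audio listener from headphone head tracking.
let engine = AVAudioEngine()
let environment = AVAudioEnvironmentNode()
engine.attach(environment)
engine.connect(environment, to: engine.mainMixerNode, format: nil)
// (A player node would be attached to `environment` to provide the positioned audio source.)
try? engine.start()

let headTracker = CMHeadphoneMotionManager() // AirPods Pro/Max expose head motion
headTracker.startDeviceMotionUpdates(to: .main) { motion, _ in
    guard let attitude = motion?.attitude else { return }
    environment.listenerAngularOrientation = AVAudio3DAngularOrientation(
        yaw: Float(attitude.yaw * 180 / .pi),     // radians to degrees
        pitch: Float(attitude.pitch * 180 / .pi),
        roll: Float(attitude.roll * 180 / .pi))
}
```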

Interestingly, users can represent themselves in XR with audio alone or choose a visual Avatar, much like choosing between a voice call and a video call in WeChat.

In addition, as can be seen from Apple's previous patents, the company is accustomed to defining mixed reality, virtual reality, and extended reality (XR) separately. XR refers to a partially or fully simulated environment that people can perceive and interact with through an electronic system. In the XR environment, the user's physical movements are tracked in real time and represented in XR, and the XR environment responds to the user's actions to simulate physical interaction.

The XR experience may be delivered by a variety of image-generating components, such as headsets, displays, projectors, and touch screens, and the system may be equipped with multiple sensors, including image sensors, light sensors, depth sensors, tactile sensors, orientation sensors, temperature sensors, position sensors, motion sensors, velocity sensors, XR map generation units, and more.

About the headset hardware

Apple notes in the patent's background section that the development of AR/VR computing systems has accelerated significantly in recent years, and there are many ways to interact with AR/VR content, such as camera-based gesture recognition, controllers, joysticks, touch surfaces, and touch screens. Through these interaction methods, users can manipulate objects such as AR images, AR video, AR text, and AR icons.

Apple argues that some current AR/VR interaction methods are cumbersome, inefficient, and offer limited feedback. For example, compared with gesture interaction, controllers are less friendly to beginners, and their weight can easily break the sense of immersion. Moreover, tracking full-body movement may require a full-body sensor suit or multiple tracking modules, which are complicated and cumbersome to operate. These input methods also demand heavy computation and consume more power, which is unfriendly to battery-powered standalone AR/VR headsets.

Therefore, a more effective, intuitive, and easy-to-learn human-computer interaction method is needed.

In terms of hardware, Apple’s Avatar patent requires some kind of image-generating computer system along with one or more input devices with computing capability, such as virtual reality or mixed reality display devices, desktop computers, mobile devices (phones, laptops, tablets, handhelds), or wearable electronics (smart watches).

These devices may be equipped with touchpads, camera arrays, touch screens, eye-tracking modules, and gesture-tracking modules, and may also include motion sensors and audio accessories. A graphical user interface (GUI) may support stylus input, fingertip input, touch and gesture input, eye input, and voice input. Users can also interact with the GUI through full-body gestures, which are captured by sensors such as cameras.

Through these interaction methods, users can draw, edit pictures, give presentations, do word processing, create icons, play games, make phone calls, hold video conferences, send email and messages, exercise, take photos, shoot videos, browse the web, listen to music, take notes, watch videos, and more in the GUI.

References: PatentlyApple, USPTO
