Action recognition is a computer vision task that involves identifying and classifying the actions or activities being performed by one or more individuals in a video or sequence of images. This task typically involves analyzing the motion and posture of the human body to determine what action is being performed, such as walking, running, jumping, waving, or sitting.
Our project does not need action recognition.
If your goal is to detect keypoints from a mobile video, you might not necessarily need a model that deals with 3D images. However, there are several advantages to using 3D pose estimation:
To provide 3D information for a model using a mobile camera, you can use several techniques:
Stereo Cameras:
Depth Sensors:
Monocular Depth Estimation:
Use deep learning models trained for monocular depth estimation to infer depth from a single camera image. These models can predict depth information based on learned patterns and context from the image.
<aside> 📖
Monocular depth estimation is a computer vision technique that predicts the depth information of a scene from a single image (monocular image). Unlike stereo vision, which requires two images from different viewpoints to estimate depth, monocular depth estimation relies on a single image and uses machine learning models to infer depth based on learned patterns and context.
</aside>
Augmented Reality (AR) Frameworks:
Reconstructing the 3D image seems reasonable for the reasons discussed above, and there appear to be enough tools for it, each requiring a different level of “developer involvement”.
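To make the link between a depth estimate and 3D keypoints concrete, here is a minimal sketch of lifting 2D keypoints into 3D camera coordinates with a pinhole camera model. The function name `backproject_keypoints`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the nested-list depth map are all assumptions made for illustration; none of them come from a specific library. The sketch also assumes the depth values are metric, which many monocular depth models do not guarantee, since they often predict only relative depth.

```python
from typing import List, Tuple

Keypoint2D = Tuple[float, float]      # (u, v) pixel coordinates
Point3D = Tuple[float, float, float]  # (X, Y, Z) in camera coordinates


def backproject_keypoints(
    keypoints: List[Keypoint2D],
    depth_map: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Lift 2D keypoints to 3D using a pinhole camera model.

    Hypothetical helper: depth_map[row][col] is assumed to hold the depth
    (distance along the optical axis) for each pixel, in metric units.
    """
    height, width = len(depth_map), len(depth_map[0])
    points: List[Point3D] = []
    for u, v in keypoints:
        # Clamp to the image bounds and read the nearest depth pixel.
        col = min(max(int(round(u)), 0), width - 1)
        row = min(max(int(round(v)), 0), height - 1)
        z = depth_map[row][col]
        # Pinhole back-projection: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points.append((x, y, z))
    return points


# Toy example: a 4x4 depth map where every pixel is 2.0 units away.
depth = [[2.0] * 4 for _ in range(4)]
points_3d = backproject_keypoints(
    [(1.0, 1.0), (3.0, 2.0)], depth, fx=2.0, fy=2.0, cx=1.0, cy=1.0
)
print(points_3d)
```

The same back-projection applies whichever source supplies the depth (stereo, a depth sensor, or a monocular model); only the quality and scale of the depth values differ.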
Edge devices are computing devices that are located at the edge of a network, close to the source of data generation. These devices perform data processing and analysis locally, rather than sending all the data to a centralized cloud or data center. Edge devices are used in edge computing, which aims to reduce latency, improve response times, and decrease bandwidth usage by processing data near its source.
This paper/repo has over 2K stars, so it seems dependable, and it's also lightweight, which suggests it should be straightforward to deploy on mobile devices.