Human Action Recognition

Human Action Recognition

This lecture overviews Human Action Recognition (HAR), which has many applications in semantic video content description, indexing, retrieval, video surveillance and Human-Computer Interaction (HCI). It covers the following topics in detail: HAR definition and data, Single-view HAR, Multiview HAR, 3D HAR, Neural HAR, GCN-based HAR and several HAR applications, e.g., in assisted living, sports analytics and gait analysis.

Human Body Posture and Pose Estimation

This lecture overviews Human Body Posture and Pose Estimation, which has many applications in Human-centered Computing, Image and Video Analysis and Social Media Analytics. It covers the following topics in detail: Human body models: 2D Skeletons, 3D Skeletons, MPEG-4 models. 2D human body posture estimation. 3D human body posture estimation. 2D human body pose estimation. 3D human body pose estimation. Facial pose estimation.

Facial Expression Recognition

This lecture overviews Facial Expression Recognition, which has many applications in Human-centered Computing, Image and Video Analysis and Social Media Analytics. It covers the following topics in detail: Classical Facial Expression Recognition: Grid-based methods, Subspace methods. DNN Facial Expression Recognition: DNN Facial Expression Recognition on static images, DNN Facial Expression Recognition on videos. 3D Facial Expression Recognition. Facial Expression Recognition datasets.

Visual Speech Recognition

This lecture overviews Visual Speech Recognition, which has many applications in Human-centered Computing, Image and Video Analysis and Social Media Analytics. It covers the following topics in detail: Visual Speech Recognition. Visemes and Phonemes, Face detection, Landmark Localization, Lip reading. Speech reading beyond the lips. Audio-Visual Speech Recognition (AVSR). Deep Audio-Visual Speech Recognition: Convolutional Neural Networks, Recurrent Neural Networks. Overlapped speech. Speaker-targeted AVSR models. Visual Speech Recognition for mobile devices. Visual Speech Recognition datasets. Experiments on each dataset.

Facial Feature Detection

This lecture overviews Facial Feature Detection, which has many applications in Human-centered Computing, Image and Video Analysis and Social Media Analytics. It covers the following topics in detail: Face Description Models (FDP/FAP, CANDIDE). Eye detection. Mouth and lip detection. Eyebrow, nose and chin detection. DNN Facial Feature Detection. Dynamic 3D face modeling.

Face De-identification for Privacy and Protection

Privacy protection is a very important issue in the context of social media and the GDPR. This lecture overviews the face de-identification problem from an engineering perspective. In principle, face de-identification methods aim to apply an affine or non-linear transformation to an input facial image, so that the identity of the depicted person can no longer be recognized by humans or by automated human analysis tools. Traditional applications in the media mainly involve applying additive noise (e.g., pixelation, blurring) or reconstruction-based techniques to the facial image region, achieving sufficient de-identification performance at the expense of degraded image quality. Recently proposed deep learning-based generative methods for face de-identification promise excellent de-identification performance against automated tools, while producing images that are visually pleasing yet still of little use to human viewers. Finally, adversarial face de-identification methods generate the minimum additive noise required to disable automated face detection/recognition systems; thus, the de-identified images retain maximal utility for human viewers.
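
As a rough illustration of the traditional approach, the sketch below pixelates a rectangular face region by replacing each small cell with its mean intensity. It is a minimal sketch under stated assumptions, not the method taught in the lecture: the function name `pixelate_region`, the bounding-box parameters and the toy 8x8 image are all hypothetical, the face box is assumed to come from a separate face detector that is not shown, and plain Python lists are used only to keep the example dependency-free.

```python
def pixelate_region(image, top, left, height, width, block=4):
    """Return a copy of a grayscale image with one rectangle pixelated.

    image: list of rows, each a list of ints in 0..255.
    (top, left, height, width): hypothetical face bounding box, assumed to
    come from a separate face detector that is not shown here.
    block: side length of the square cells that are averaged.
    """
    out = [row[:] for row in image]  # copy so the input is left untouched
    bottom = min(top + height, len(image))
    right = min(left + width, len(image[0]))
    for r0 in range(top, bottom, block):
        for c0 in range(left, right, block):
            r1 = min(r0 + block, bottom)
            c1 = min(c0 + block, right)
            # Collect the original pixel values of this cell.
            cell = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            mean = sum(cell) // len(cell)
            # Overwrite every pixel of the cell with the cell mean.
            for r in range(r0, r1):
                for c in range(c0, c1):
                    out[r][c] = mean
    return out


# Toy 8x8 gradient standing in for a real photograph (illustrative data only).
image = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
masked = pixelate_region(image, top=2, left=2, height=4, width=4, block=2)
```

A larger `block` removes more identity-bearing detail inside the box but also degrades the image more, which is the trade-off between de-identification performance and image quality described above.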