Head Rotation in Denoising Diffusion Models

Denoising Diffusion Models (DDM) are emerging as the cutting-edge technology in the realm of deep generative modeling, challenging the dominance of Generative Adversarial Networks.

GitHub Link

The GitHub link is https://github.com/asperti/head-rotation

Introduction

This repository, "head-rotation", accompanies the article "Head Rotation in Denoising Diffusion Models." The collaboratively authored article addresses the challenge of exploring and manipulating the latent space of Denoising Diffusion Models (DDM) for face rotation. Using an embedding technique for Denoising Diffusion Implicit Models (DDIM), the authors achieve significant manipulations of the face rotation angle, up to ±30°. The method computes trajectories in the latent space through linear regression; moving a latent code along such a trajectory rotates the corresponding face. As a byproduct, the CelebA dataset is labeled by illumination direction, improving the selection of images used to fit the trajectories. The study highlights the intricate relationship between illumination, pose, and rotation.

Content

This is a companion repository to the article "Head Rotation in Denoising Diffusion Models", joint work with Gabriele Colasuonno and Antonio Guerra. In this research, our focus is specifically on face rotation, recognized as one of the most complex editing operations. By utilizing a recent embedding technique for Denoising Diffusion Implicit Models (DDIM), we have achieved remarkable manipulations covering a wide rotation angle of up to $\pm 30^\circ$, while preserving the distinct characteristics of each individual.

Our methodology involves computing trajectories that approximate clusters of latent representations of dataset samples with various yaw rotations through linear regression. These trajectories are obtained by analyzing subsets of data that share significant attributes with the source image. One of these critical attributes is the light provenance: as a byproduct of our research, we have labeled the CelebA dataset, categorizing images into three major groups based on the illumination direction: left, center, and right.

For a fixed direction (left or right), the approach is schematically described in the following picture. We prefer to compute centroids instead of directly fitting over all clusters, for computational reasons.

In the picture below, we summarise the outcome of our labeling and the complex interplay between illumination and orientation by showing the mean faces corresponding to different light sources and poses.
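The centroid-and-regression idea can be sketched in a few lines of numpy. This is a minimal illustration, not the repository's actual code: the latent codes, dimensionality, and yaw labels below are synthetic stand-ins (real latents would come from DDIM embedding, and yaw labels from a pose estimator), but the three steps mirror the description above: bin samples by yaw, compute one centroid per bin, and fit a linear trajectory through the centroids.

```python
import numpy as np

# Synthetic stand-in data: 500 flattened 64-dim "latents", each with a
# known yaw angle in degrees. In the real pipeline these would be DDIM
# embeddings of CelebA images sharing attributes (e.g. light direction).
rng = np.random.default_rng(0)
yaws = rng.uniform(-30, 30, size=500)
direction_true = rng.normal(size=64)          # hidden ground-truth axis
latents = np.outer(yaws, direction_true) + rng.normal(scale=0.1, size=(500, 64))

# 1. Bin samples by yaw and compute one centroid per bin
#    (fitting on centroids rather than on all points, as described above).
bins = np.linspace(-30, 30, 7)
centroids, bin_yaws = [], []
for lo, hi in zip(bins[:-1], bins[1:]):
    mask = (yaws >= lo) & (yaws < hi)
    if mask.any():
        centroids.append(latents[mask].mean(axis=0))
        bin_yaws.append(yaws[mask].mean())
centroids = np.stack(centroids)
bin_yaws = np.asarray(bin_yaws)

# 2. Linear regression through the centroids:
#    latent ≈ intercept + yaw * direction
A = np.stack([np.ones_like(bin_yaws), bin_yaws], axis=1)
coef, *_ = np.linalg.lstsq(A, centroids, rcond=None)
direction = coef[1]   # per-degree displacement along the trajectory

# 3. Edit: move a source latent +20 degrees along the fitted trajectory;
#    decoding the shifted latent would yield the rotated face.
source = latents[0]
rotated = source + 20.0 * direction
```

Fitting on a handful of centroids instead of all samples keeps the regression cheap and robust to per-image noise, which is the computational motivation mentioned above.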
