Andrew Ng: How Diffusion Models Work

You want a neural network to learn what a sprite is:

fine details
general outlines
everything in between

Add different levels of noise to the sprite training data, so that training emphasizes either the details or the outlines depending on the level. → noising process.
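A minimal sketch of the noising idea, using plain Python lists in place of image tensors. The helper `add_noise`, its `signal_kept` parameter, and the 4-pixel sprite are illustrative assumptions, not names from the course.

```python
import math
import random

def add_noise(pixels, signal_kept, rng):
    """Blend pixels with Gaussian noise.

    signal_kept is in [0, 1]: 1.0 returns the sprite unchanged,
    0.0 returns pure noise.
    """
    return [
        math.sqrt(signal_kept) * p + math.sqrt(1.0 - signal_kept) * rng.gauss(0.0, 1.0)
        for p in pixels
    ]

rng = random.Random(0)
sprite = [1.0, -1.0, 1.0, -1.0]                # hypothetical 4-pixel sprite in [-1, 1]
slightly_noisy = add_noise(sprite, 0.99, rng)  # fine details are still visible
mostly_noise = add_noise(sprite, 0.05, rng)    # at best a rough outline survives
```

Low noise levels keep the fine details in the training signal, while high noise levels leave only the general outline.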

Sampling

To sample means to draw a value from a random distribution.
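As a small illustration, drawing from a standard normal distribution could look like this. The 16x16 size is an assumption chosen for the example.

```python
import random

rng = random.Random(42)

# One draw from the standard normal distribution N(0, 1).
value = rng.gauss(0.0, 1.0)

# A whole image of independent draws: pure noise, the usual starting point.
height, width = 16, 16  # hypothetical sprite size
noise_image = [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]
```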

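Stepping through time is, in essence, the denoising process: going back from a drop of ink that has fully diffused in water to the shape it had when it first hit the water. The process is described as: original noise → predicted noise. After that, a sampling algorithm is applied: DDPM.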

noise schedule: controls the level of noise applied to the image at a given time step.
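One possible noise schedule, sketched under assumptions: a linear schedule with 500 steps and beta running from 1e-4 to 0.02. These values are common DDPM-style defaults, not something fixed by these notes, and `linear_noise_schedule` is a hypothetical helper name.

```python
def linear_noise_schedule(timesteps=500, beta_start=1e-4, beta_end=0.02):
    """Return per-step betas, alphas, and cumulative alpha products (index 0..timesteps)."""
    betas = [
        beta_start + (beta_end - beta_start) * t / timesteps
        for t in range(timesteps + 1)
    ]
    alphas = [1.0 - b for b in betas]

    # alpha_bar[t] is the product of alphas[0..t]: the share of signal left at step t.
    alpha_bars = []
    running = 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)
    return betas, alphas, alpha_bars

betas, alphas, alpha_bars = linear_noise_schedule()
```

Here `alpha_bars` shrinks steadily with the time step, so later steps correspond to noisier images.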

The key step of denoising: remove the predicted noise (the noise the model considers not to be part of the sprite being generated).

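During diffusion we denoise iteratively, but after the first denoising step the generated data points already drift away from a normal distribution. So at every step we add extra noise, scaled by factors determined by the noise schedule. This helps stabilize the DNN, so that it does not collapse into a model that can only express the average of the training set.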
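The iterative step described above can be sketched as follows, using the standard DDPM update on plain Python lists. The linear schedule values and the `predict_noise` stand-in are assumptions made for the example; a real setup would call the trained network there.

```python
import math
import random

T = 500  # assumed number of time steps
betas = [1e-4 + (0.02 - 1e-4) * t / T for t in range(T + 1)]
alphas = [1.0 - b for b in betas]
alpha_bars, running = [], 1.0
for a in alphas:
    running *= a
    alpha_bars.append(running)

def predict_noise(x, t):
    """Hypothetical stand-in for the trained network: treats the whole input as noise."""
    return list(x)

def ddpm_step(x, t, rng):
    """Remove the predicted noise, then add back schedule-scaled noise (except at t == 1)."""
    eps = predict_noise(x, t)
    coef = (1.0 - alphas[t]) / math.sqrt(1.0 - alpha_bars[t])
    mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
    if t == 1:
        return mean  # last step: no extra noise
    sigma = math.sqrt(betas[t])
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(8)]  # start from pure noise
for t in range(T, 0, -1):                    # step backwards through time
    x = ddpm_step(x, t, rng)
```

The extra noise term is what keeps the intermediate samples close to the noisy distribution the network saw during training.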

UNet

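init.conv extracts information from the image, which is then compressed through down-sampling into a small layer and passed through the same number of up-sampling layers, producing an output of the same size as the input. A distinctive feature of UNet is that extra information can be added at each up-sampling step; here we need to pass the time step to the model as a factor.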

Later, a context embedding can also be added (it steers the model to generate content matching a given text description).
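A toy sketch of the data flow only, with no learned weights: average pooling stands in for down-sampling, nearest-neighbour copying for up-sampling, and the extra information is injected before each up-sampling step. The form `context * features + time`, the scalar embeddings, and every function name here are assumptions for illustration.

```python
def downsample(img):
    """2x2 average pooling: halves height and width."""
    h, w = len(img), len(img[0])
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]

def upsample(img):
    """Nearest-neighbour upsampling: doubles height and width."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def condition(feat, time_emb, context_emb=1.0):
    """Inject extra information: scale by the context, shift by the time step."""
    return [[context_emb * v + time_emb for v in row] for row in feat]

def toy_unet(img, t, total_steps=500, context_emb=1.0):
    time_emb = t / total_steps  # hypothetical scalar embedding of the time step
    small = downsample(downsample(img))                       # two down-sampling steps
    up1 = upsample(condition(small, time_emb, context_emb))   # first up-sampling step
    return upsample(condition(up1, time_emb, context_emb))    # second up-sampling step

image = [[0.0] * 8 for _ in range(8)]
output = toy_unet(image, t=250)  # same 8x8 size as the input
```

A real UNet replaces each of these fixed operations with learned convolutional blocks and uses vector embeddings instead of scalars.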

Training

Take a sprite and add noise to it to get sprite_1. Feed sprite_1 into the NN to get the predicted noise, then compare it with the original noise to compute the loss.
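The loss computation can be sketched like this, assuming mean squared error as the comparison. The `predict_noise` function is a hypothetical untrained stand-in for the network.

```python
import random

def mse(predicted, target):
    """Mean squared error between two equally long lists."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)

def predict_noise(noisy_sprite):
    """Hypothetical untrained model: always predicts zero noise."""
    return [0.0 for _ in noisy_sprite]

rng = random.Random(0)
sprite = [1.0, -1.0, 0.5, -0.5]                    # hypothetical 4-pixel sprite
noise = [rng.gauss(0.0, 1.0) for _ in sprite]      # the noise we add and later try to predict
sprite_1 = [s + n for s, n in zip(sprite, noise)]  # noised sprite fed to the network
loss = mse(predict_noise(sprite_1), noise)
```

Training would adjust the network's weights to push this loss towards zero.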

A more stable approach:

Sample the noise level at a random timestep, and add noise according to that noise level.
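A sketch of building one training example this way, assuming a linear schedule with 500 steps and the usual DDPM mixing formula. The schedule values and the `make_training_example` helper are illustrative assumptions.

```python
import math
import random

T = 500  # assumed number of time steps
betas = [1e-4 + (0.02 - 1e-4) * t / T for t in range(T + 1)]
alpha_bars, running = [], 1.0
for b in betas:
    running *= 1.0 - b
    alpha_bars.append(running)

def make_training_example(sprite, rng):
    """Pick a random timestep and noise the sprite to that step's noise level."""
    t = rng.randint(1, T)  # inclusive on both ends
    noise = [rng.gauss(0.0, 1.0) for _ in sprite]
    signal = math.sqrt(alpha_bars[t])        # how much of the sprite is kept
    spread = math.sqrt(1.0 - alpha_bars[t])  # how much noise is mixed in
    noisy = [signal * s + spread * n for s, n in zip(sprite, noise)]
    return noisy, t, noise

rng = random.Random(0)
noisy, t, noise = make_training_example([1.0, -1.0, 0.5, -0.5], rng)
```

The network would receive `noisy` and `t` as inputs, with `noise` as the target, so that over many examples it sees every noise level.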
