TrilightLabs - Think Different

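Problems I ran into while trying a Flux face-swap workflow

In the end this problem made me laugh at myself. Here is a reconstruction of what went wrong while I set up the environment for this workflow.

The workflow is a face-swap workflow shared by an overseas user on civitai. After importing it into my local ComfyUI, I installed the missing nodes as usual, including the following:

Why so many nodes to install? When I first imported the workflow, it only reported three missing nodes: GGUF, everywhere, and reactor.

After installing them and restarting, the following error appeared: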

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\nodes.py", line 1993, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\comfyui_face_parsing\__init__.py", line 18, in <module>
    download_url("https://huggingface.co/jonathandinu/face-parsing/resolve/main/config.json?download=true", face_parsing_path, "config.json")
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torchvision\datasets\utils.py", line 134, in download_url
    url = _get_redirect_url(url, max_hops=max_redirect_hops)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torchvision\datasets\utils.py", line 82, in _get_redirect_url
    with urllib.request.urlopen(urllib.request.Request(url, headers=headers)) as response:
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 519, in open
    response = self._open(req, data)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 496, in _call_chain
    result = func(*args)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\urllib\request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10060] 由于连接方在一段时间后没有正确答复或连接的主机没有反应，连接尝试失败。>

Cannot import D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\comfyui_face_parsing module for custom nodes: <urlopen error [WinError 10060] 由于连接方在一段时间后没有正确答复或连接的主机没有反应，连接尝试失败。>
FizzleDorf Custom Nodes: Loaded
FaceDetailer: Model directory already exists
FaceDetailer: Model doesnt exist
FaceDetailer: Downloading model

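This shows that the face_parsing node was installed, but its import failed because the request to huggingface.co timed out (WinError 10060: the remote host did not respond), and the model that FaceDetailer depends on failed to download as well. GPT pointed me to the nodes.py file in the plugin's folder to see which models it needs and where they are downloaded from. The code is as follows: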

def get_restorers():
    models_path = os.path.join(models_dir, "facerestore_models/*")
    models = glob.glob(models_path)
    models = [x for x in models if (x.endswith(".pth") or x.endswith(".onnx"))]
    if len(models) == 0:
        fr_urls = [
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GFPGANv1.3.pth",
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GFPGANv1.4.pth",
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/codeformer-v0.1.0.pth",
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GPEN-BFR-512.onnx",
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GPEN-BFR-1024.onnx",
            "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GPEN-BFR-2048.onnx",
        ]
        for model_url in fr_urls:
            model_name = os.path.basename(model_url)
            model_path = os.path.join(dir_facerestore_models, model_name)
            download(model_url, model_path, model_name)
        models = glob.glob(models_path)
        models = [x for x in models if (x.endswith(".pth") or x.endswith(".onnx"))]
    return models

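Well, the URLs are all correct, and the models can simply be downloaded by hand and placed in models/facerestore_models. Yet the background download kept reporting no response. After I downloaded the files manually and restarted, the problem went away.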
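The manual fix can be planned with a small script. The following is a minimal sketch, not the plugin's own code: it takes the URLs from the get_restorers() excerpt and reports which files are still missing from a local folder, so they can be fetched by hand (for example in a browser). The helper name missing_models and the folder path are assumptions made for illustration.

```python
import os

# URLs copied from the get_restorers() excerpt.
FR_URLS = [
    "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GFPGANv1.3.pth",
    "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GFPGANv1.4.pth",
    "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/codeformer-v0.1.0.pth",
    "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GPEN-BFR-512.onnx",
    "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GPEN-BFR-1024.onnx",
    "https://huggingface.co/datasets/Gourieff/ReActor/resolve/main/models/facerestore_models/GPEN-BFR-2048.onnx",
]


def missing_models(urls, models_folder):
    """Return (url, target_path) pairs for files not yet present in models_folder.

    Hypothetical helper for illustration; it only inspects the disk and
    never touches the network.
    """
    missing = []
    for url in urls:
        name = os.path.basename(url)
        target = os.path.join(models_folder, name)
        if not os.path.isfile(target):
            missing.append((url, target))
    return missing


if __name__ == "__main__":
    # Assumed location; adjust to the actual ComfyUI install.
    folder = os.path.join("models", "facerestore_models")
    for url, target in missing_models(FR_URLS, folder):
        print(f"download {url}\n  -> {target}")
```

Because the check is purely local, it can be rerun after each manual download until nothing is listed.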

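Then came the big one: ComfyUI reported that the AV_Facedetailer node could not be found. This puzzled me and cost me two hours. Why? Google turned up only one matching result, and it had no download link; the GitHub page was empty. So now you know why the first screenshot shows such a long list of installed nodes. I kept assuming the node was an impact-pack dependency, and after all that effort I was right back where I started.

I went back to civitai to read the comments, and sure enough, everyone had hit the same problem.

As shown in the screenshot:

The name AV_Facedetailer is nowhere near art-venture! Without this guy posting it, nobody on earth would have found this node. All in all it took more than three hours to get this workflow installed.

So, if you run into something similar, remember to read the comments and read the manual. What a pain. This is just a record for now; I will add the face-swap results later, because I found that the GGUF node depends on a pre-trained model.

September 28 update: problems with the face-swap workflow nodes

Today I wanted to actually run the workflow to see whether it works end to end. After uploading a photo, the following error appeared: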

Error occurred when executing KSampler:

cast_to() got an unexpected keyword argument 'copy'

  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\execution.py", line 317, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\execution.py", line 192, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\nodes.py", line 1429, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\nodes.py", line 1396, in common_ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\ComfyUI-Impact-Pack\modules\impact\sample_error_enhancer.py", line 9, in informative_sample
    return original_sample(*args, **kwargs)  # This code helps interpret error messages that occur within exceptions but does not have any impact on other operations.
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 420, in motion_sample
    return orig_comfy_sample(model, noise, *args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\sample.py", line 43, in sample
    samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 829, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 729, in sample
    return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 716, in sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 695, in inner_sample
    samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 600, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\k_diffusion\sampling.py", line 144, in sample_euler
    denoised = model(x, sigma_hat * s_in, **extra_args)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 299, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 682, in __call__
    return self.predict_noise(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 685, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 279, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\samplers.py", line 228, in calc_cond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\model_base.py", line 142, in apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\ldm\flux\model.py", line 159, in forward
    out = self.forward_orig(img, img_ids, context, txt_ids, timestep, y, guidance, control)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\ldm\flux\model.py", line 118, in forward_orig
    img, txt = block(img=img, txt=txt, vec=vec, pe=pe)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\ldm\flux\layers.py", line 148, in forward
    img_mod1, img_mod2 = self.img_mod(vec)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\ldm\flux\layers.py", line 110, in forward
    out = self.lin(nn.functional.silu(vec))[:, None, :].chunk(self.multiplier, dim=-1)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\python\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\comfy\ops.py", line 67, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\ComfyUI-GGUF\ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\ComfyUI-GGUF\ops.py", line 174, in forward_ggml_cast_weights
    weight, bias = self.cast_bias_weight(input)
  File "D:\ComfyUI-aki\ComfyUI-aki-v1.3\custom_nodes\ComfyUI-GGUF\ops.py", line 144, in cast_bias_weight
    bias = comfy.ops.cast_to(bias, bias_dtype, device, non_blocking=non_blocking, copy=False)

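Clearly, this error points to a compatibility problem between the GGUF and AnimateDiff plugins. I noticed that my local AnimateDiff plugin had no models installed and guessed that the missing models were causing the error. So I downloaded two models, one ckpt and one lora, but that did not fix it; the same error remained.

Asking GPT for an answer:

I sent it the error above and added the following: I have seen others report that this error is caused by the animatediff plugin (https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved), supposedly because it is not the latest version. But everything I have is already upgraded to the latest version, and the error is still there.

The answer I got:

The error you are seeing, cast_to() got an unexpected keyword argument 'copy', is indeed related to a compatibility issue between the latest version of ComfyUI and the AnimateDiff-Evolved plugin.

The root cause appears to be a mismatch between the ComfyUI update and the AnimateDiff-Evolved plugin. Recent ComfyUI updates introduced changes that older versions of some custom nodes do not fully support, including how they handle arguments such as copy in the cast_to() function (GitHub) (GitHub).

To resolve this, do the following:

Update AnimateDiff-Evolved: make sure you are running the latest version of the AnimateDiff-Evolved plugin. The plugin's developer, Kosinkadink, mentioned that a small refactor was needed to make the plugin compatible with the latest changes in ComfyUI. Updating both core ComfyUI and AnimateDiff-Evolved should resolve the issue (GitHub).

Check node dependencies: if the problem persists after updating, check whether any node- or sampler-specific dependencies (for example ComfyUI-GGUF) also need updating, since some nodes may still reference deprecated arguments such as copy (GitHub).

Following these steps should help eliminate the error and get the plugin running smoothly.

So at this point I updated the main ComfyUI program to the latest version, and the error was resolved!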
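The message itself says that some caller passed a copy keyword to a cast_to() that does not accept it, which fits a plugin that is newer than the installed core and explains why updating the core helped. Below is a minimal sketch of how such a mismatch could be detected with the standard library before making the call. The two cast_to stand-ins are hypothetical and are not ComfyUI's real implementations; accepts_keyword is a helper name invented for this example.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        kind = params[name].kind
        return kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                        inspect.Parameter.KEYWORD_ONLY)
    # A **kwargs catch-all also accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an older and a newer core function.
def cast_to_old(weight, dtype=None, device=None, non_blocking=False):
    return weight


def cast_to_new(weight, dtype=None, device=None, non_blocking=False, copy=False):
    return weight


if __name__ == "__main__":
    for fn in (cast_to_old, cast_to_new):
        print(fn.__name__, "accepts copy:", accepts_keyword(fn, "copy"))
```

In a real install the same check could be pointed at the imported function instead of the stand-ins, which tells you whether the core or the plugin is the outdated side.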

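ComfyUI-Flux: text rendered directly in the image, and impressions from running the model

I expect the Flux model to stay at the top of the vision-model rankings for a long time. The diversity of its styles and its control over human limbs are unmatched by other models. The cases below show its generation quality in practice.

Case 1: Disney-style movie poster

First, I ran this poster through an image-to-text (captioning) tool to get its keywords. Tool used: https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha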

The keywords obtained are as follows: This is a digital promotional poster for the Disney+ animated film “The Ice Age Adventures of Buck Wild.” The image is a vibrant, colorful cartoon depiction set in a lush, jungle-like environment. The background features dense foliage, tall trees with broad leaves, and a variety of greenery, creating a sense of depth and immersion.

In the foreground, two anthropomorphic ground sloths, Buck and Crash, are prominently featured. They are standing on a large, gnarled tree branch, with Buck on the left and Crash on the right. Buck is holding a stick in his right hand and has a playful expression, while Crash is smiling and has his arms outstretched, as if excited. Both characters have light brown fur with darker brown stripes, and their eyes are large and expressive.

The title “The Ice Age Adventures of Buck Wild” is prominently displayed in large, bold, yellow letters in the center of the poster. Above the title, the text “Disney+ + gets wild” is written in white. Below the title, the Disney+ logo is visible, along with the phrase “Original movie from 20th Century Studios.” The poster’s overall style is bright and cheerful, with a playful, adventurous tone.

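Don't underestimate this captioning tool. It currently offers the best experience I have had, and its image recognition is very strong. If you are interested, you could turn it into a plugin. I digress; let's look at my workflow.

I used the flux_bnb_nf4_v2 checkpoint in a plain text-to-image workflow. The result is as follows:

I replaced the two characters in the poster with a cat and a dog and changed the text on the poster, which gave the result above.

Case 2: 3D-style portrait

Again a poster collected from the web, this time in a cyberpunk style.

Again I used the captioning tool to derive the prompt, with one caveat. Possibly because of limitations in the captioning model, it describes this kind of character as anime style. So I edited the prompt to give it a 3D, Blender-rendered look.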

The prompt is as follows: This image is a digital illustration, likely created in a comic book style, featuring a futuristic, cyberpunk aesthetic. The central figure is a young woman with pale blue skin and striking, large, orange eyes. Her hair is platinum blonde and styled in a sleek, high ponytail. She is dressed in a high-tech, form-fitting outfit with metallic accents, giving her a futuristic, robotic appearance. Her left hand, which is gloved in a black, mechanical-looking glove, is holding a clear glass filled with a refreshing drink, which she is sipping through a straw.

The background is predominantly black, with vibrant yellow and orange accents, creating a striking contrast that highlights the central figure. The magazine cover title, “FAVR,” is prominently displayed in large, bold letters at the top, with additional Japanese text on the left side. The word “SMOOTHIE” is written in bold, white letters at the bottom, emphasizing the theme of the cover. The overall color palette is a mix of cool blues and warm oranges, contributing to the high-tech, futuristic vibe of the artwork. The image is detailed, with a focus on the woman’s expressive face and the sleek, futuristic design of her outfit.

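The result:

You might say: why are the fingers still drawn badly? Don't overlook one thing: to get fingers right you need to pair the model with a Flux realism LoRA. (I was just padding out a post and couldn't be bothered to reproduce it.)

Case 3: futuristic sci-fi portrait

Reference image

The prompt was derived the same way with the same steps, so I won't repeat them.

Setting up the Flux environment requires the corresponding nodes:

ControlNet: https://github.com/XLabs-AI/x-flux-comfyui

Install the nodes under custom_nodes.

For workflows, see: https://github.com/ZHO-ZHO-ZHO/ComfyUI-Workflows-ZHO

Further reading on local deployment: https://www.freedidi.com/13266.html

The tool is good, but the real value lies with the user who solves problems in an actual workflow.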

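How to curate the training set for a game-icon style model

Definition: a game-icon style model covers many, varied icon categories. The categories are unevenly represented in the training set; in games, treasure chests, gems, potion bottles, and coupons usually make up the majority. That is why this kind of style model is typically used for style transfer in a workflow.

Because the categories differ in how many training images they have, shapes from the well-represented categories become baked in as the number of training epochs grows, and it is hard to change them by adding shape-related prompt words. Conversely, categories with few training images only produce stable results after more epochs. The conclusion: there is no single, absolutely stable style LoRA; production needs checkpoints from different epochs depending on the requirement.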
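The imbalance described here can be measured before training. The following is a minimal sketch, assuming each training image has already been assigned a single category label; the labels and counts are made up for illustration and are not the author's data set.

```python
from collections import Counter


def category_shares(labels):
    """Return {category: share of the training set}, largest share first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.most_common()}


if __name__ == "__main__":
    # Hypothetical labels, one per training image.
    labels = ["chest"] * 40 + ["gem"] * 30 + ["potion"] * 20 + ["book"] * 10
    for cat, share in category_shares(labels).items():
        print(f"{cat:8s} {share:.0%}")
```

Categories at the bottom of such a list are the natural candidates for supplementing with external material, and the ones at the top are where baked-in shapes are most likely.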

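About tagging

When training a style model, there is no need to give every image a shared trigger word. Instead, tag each image as attribute + feature description, for example: gift box,flower,ribbons,bow,

When training an icon model dedicated to one category, such as avatar frames, analyze the shape features of the existing avatar-frame assets. Avatar frames are relatively simple shapes and broadly fall into symmetric and asymmetric designs, so add the tags symmetry and asymmetrical when tagging. But I digress.

Supplementing the training set

A game project has a limited number of icon assets, and some categories don't even have enough for training. In that case, deliberately bring in external material, choosing images with a similar style and roughly the same level of realism.

For example, the book icons in the image above all look much alike. Adding book icons with other shapes and angles helps the model generalize and meet the requirement.

The gift-box icons are another example: the training material was too uniform, so the shapes lacked variation in detail. To improve generalization for gift boxes, I deliberately added related training material.

The image below shows the result after supplementing:

Gems are another example: the original gems lacked variation, the shapes were too uniform, and they lacked a sense of design.

After adding material:

The value of a style model lies in covering a broad range of icon shapes while inheriting the game project's style. It does not aim for exact shapes; it leaves room for edits and only needs to support style transfer in batch icon production.

About contamination in generated icons

As shown, the fruit ice cream picks up elements such as leaves and bread. This is the effect of adding the food tag during tagging. So when iterating on the model, clean up the tags on the relevant images: even though they are all food, tag them at a finer granularity.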
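Cleaning up an overly broad tag can be scripted. The following is a minimal sketch, assuming captions are comma-separated tag strings like the gift box example; the function name and the sample caption are illustrative only.

```python
def remove_tag(caption, tag):
    """Drop every exact occurrence of `tag` from a comma-separated caption.

    Matching is per tag, so removing "food" leaves "seafood" untouched.
    Spacing is normalized and empty entries (trailing commas) are dropped.
    """
    tags = [t.strip() for t in caption.split(",")]
    kept = [t for t in tags if t and t.lower() != tag.lower()]
    return ", ".join(kept)


if __name__ == "__main__":
    # Hypothetical caption for a fruit ice cream icon.
    print(remove_tag("food, ice cream, fruit, strawberry,", "food"))
```

Applied to every caption file of the affected category, this removes the shared tag while keeping the specific ones that describe each icon.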

Food icons not contaminated by tagging

The icons above were trained on the base model revAnimated_v122EOL, with 674 images in the training set.

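[DALL·E 3] is arguably the easiest natural-language drawing tool

WeChat stickers of the chubby puppy Dundun, drawn with DALL·E 3

The above are stickers drawn with DALL·E 3. Compared with Midjourney's repeated image-prompt generation, it is much more efficient. It is natural enough that you only need to tell it in Chinese what to do.

Here is a reconstruction of the conversation:

If you are not satisfied, you can send it a reference image directly; for example, I wanted a melon-eating (onlooker) sticker.

This way you don't have to put the composition into words for it, thanks to its strong image understanding.

I wanted a sticker of the chubby puppy poking its head out of a handmade bag, but wasn't happy with the result. This is what it generated after I gave it a reference image.

Differences from Midjourney in interaction:

1. You describe things in Chinese, and every generation is a fresh request, so continuity is slightly weaker than Midjourney's. But if a second generation comes out in a very different style, you can ask for "a style similar to image XX" to bring the styles back in line.

2. Generation needs no commands. The more specific your description, the closer the result matches your expectations; avoid abstract words.

3. Midjourney's image prompts only imitate the form. DALL·E 3 can borrow the composition from a reference while also understanding what the object being drawn is.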

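柔柔酱 (Rourou-chan) sticker pack, made with Midjourney

Generating stickers with Midjourney requires a specific combination of prompt parameters:

--style cute --q 2 --s 750

Example: a super cute little yellow fox, happy, sad, helpless, loving, angry, indifferent, 9 icons, stick figure, pure white background --style cute --q 2 --s 750 --v 5.1

Note the added stick figure (simple line-drawing style) and the use of the v5.1 model. Along the way you need to keep filtering and upscaling to find suitable expressions, then fine-tune.

Below are the raw generated images (there are many more that I won't show). Import them into PS, pick the suitable ones, then upscale and refine the details (using the Upscayl software).

Overall, this is a bit more tedious than DALL·E 3 and more random. The upside is volume: there are more options to choose from.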

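[stable diffusion] Notes on AI game items and their use in projects

I spent 2 days training a style model for game item icons, with a training set of just over 500 images.

Training parameters:

Base model: revAnimated_v122EOL.safetensors

Finely tagged training set, each image trained 18 times. 30 epochs in total, saving every 10 epochs, batch size 2

learning rate 1e-4, unet-lr 1e-4, text-encoder-lr 1e-5

network dim 128, network alpha 64

Total steps: 95904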
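For reference, a common convention in LoRA trainers (assumed here; the post does not say how its trainer counts steps) is steps = images × repeats × epochs ÷ batch size. Below is a minimal sketch of that arithmetic. The image count of 500 is a placeholder, since the post gives only an approximate figure, so the result is not expected to reproduce the total reported in the post.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size):
    """Optimizer steps under the assumed convention: each epoch sees
    num_images * repeats samples, grouped into batches (the last batch
    of an epoch may be partial)."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


if __name__ == "__main__":
    # repeats=18, epochs=30, batch_size=2 come from the post; 500 is a placeholder.
    print(total_steps(num_images=500, repeats=18, epochs=30, batch_size=2))
```

Plugging in the exact image count and the trainer's own counting rule is what decides whether the computed figure lines up with the log.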

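Training ran on the service provided by 揽睿星舟, on an RTX 3090 Ti with 24 GB of VRAM, and took about 5 hours.

For tagging, I adjusted each image's description and removed meaningless tag letters, using the structure [trigger word + attribute word + feature description]. In practice the text-to-image results are very good: the model generates assets that accurately match the text description.

How can AI gain creative drawing ability?

2023 was the breakout year for AI art, but its creative ability may still fall short. Stable diffusion in particular is a diffusion model, and models trained on this technology generate what they are fed: the higher the quality and the larger the quantity of the input, the better the result. Relying on quality and quantity to make up for limited creativity is itself a clever approach.

I am optimistic about the drawing ability of OpenAI's DALL·E 3. It truly integrates human natural-language ability, and its understanding is closer to how people think. I even tried drawing game icons with it and used them in a company project. I also tried it for designing children's T-shirts for an export children's-wear factory (I will share that workflow later too).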

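How can AI improve efficiency in a real game item icon pipeline?

I think AI's greatest value in games is still cutting cost and raising efficiency; its creative ability is mediocre. In a real project, using AI tools sensibly means bringing them in up front. For example, when a request is raised, the game designers can use SD (stable diffusion) to generate reference images, some of which need only simple edits to be usable. Handing those to the artists for revision saves a lot of work.

Optimize the existing workflow. If the project team as a whole has not studied AI in depth, or focusing on the creative design at hand matters more, I suggest setting up a shared AI team that serves all projects. Its members focus on optimizing AI tool workflows and building experience in model training, while taking on design requests from each project team and learning from them. Assign each member a contact role, for example icon contact, character illustration contact, or scene concept art contact, to keep request handoffs accurate and consistent. Also hold a weekly review with the people who submit requests: summarize current problems, which new techniques were used, which processes were optimized, and how much efficiency improved over the existing baseline.

To maximize the value of AI tools in a project, AI involvement needs a clearly mapped process, and the producer, game designers, art director, APM, and UI designers all need to meet and reach alignment.

A normal request starts from gameplay innovation and is carefully packaged by the narrative designers. Bringing AI in afterwards only makes the output reactive, because with AI's randomness, model stability, and training-set quality issues it is currently hard to match a request 100%.

LoRA model + style transfer + ControlNet rerolls

The mainstream approach is still to reroll generations until a better result appears, helped by Tencent's IP-Adapter plugin and reference-type models. Style transfer alone raised production efficiency by half. Combined with a suitably adjusted weight for ControlNet's lineart, a simple sketch or a half-finished base drawing handed to IP-Adapter reaches a consistent style within seconds. (I will share a detailed tutorial later.)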