pull/239/head
ThereforeGames 2024-03-06 19:39:11 -05:00
parent 2d4a184597
commit 0718bbabd8
93 changed files with 26901 additions and 148 deletions

View File

@ -80,7 +80,7 @@
{
"canny":"controlnet11Models_canny",
"depth":"controlnet11Models_depth",
"normalmap":"control_v11p_sd15_normalbae",
"normalmap":"controlnet11Models_normal",
"openpose":"controlnet11Models_openpose",
"mlsd":"",
"lineart":"controlnet11Models_animeline",

View File

@ -9,4 +9,4 @@ Unprompted is a powerful templating language written in Python. Unlike most temp
Software created by [Therefore Games](https://therefore.games). If you like my work, you can [sponsor the project on ☕ GitHub](https://github.com/sponsors/ThereforeGames) or [support me on Patreon](https://patreon.com/thereforegames). Thank you!
*Compatible with Python v3.10.6 and WebUI v1.6.0.*
*Compatible with Python v3.10.6 and WebUI v1.7.0.*

View File

@ -1,6 +1,28 @@
# Unprompted Announcements
Stay informed on the latest Unprompted news and updates.
<details><summary>Spice It Up - 6 March 2024</summary>
Hi folks,
I have just released Unprompted v10.7.0, which includes two notable features:
First, the **Magic Spice template** that aims to "beautify" your Stable Diffusion results using techniques from [Fooocus](https://github.com/lllyasviel/Fooocus) and elsewhere.
It can, for example, run a GPT-2 model to expand your prompt, automatically apply optimized Loras and embeddings, and even fix issues with image contrast. Here are some before/after examples using the `allspice_v1` preset:
![magic_spice_demo]([base_dir]/images/posts/magic_spice_demo.jpg)
This update also adds the `[autotone]` shortcode, which implements the Photoshop algorithm of the same name. It adjusts the black point of an image to enhance contrast, which is particularly useful when working with a low CFG scale or Loras that present gamma problems. Simply include `[after][autotone][/after]` in your prompts to engage the feature:
![autotone_demo]([base_dir]/images/posts/autotone_demo.png)
Finally, v10.7.0 addresses a few bugs and improves compatibility with the Forge WebUI.
Thank you for enjoying Unprompted.
</details>
<details><summary>Cool Autumn Update - 11 October 2023</summary>
Hi folks,

View File

@ -3,7 +3,42 @@ All notable changes to this project will be documented in this file.
For more details on new features, please check the [Manual](./MANUAL.md).
<details open><summary>10.6.0 - 1 December 2023</summary>
<details open><summary>10.7.0 - 6 March 2024</summary>
### Added
- New shortcode `[autotone]`: Adjusts the black point of the image to enhance contrast (should be placed inside an `[after]` block)
- New free template Magic Spice v0.0.1: Produces high-quality images regardless of the simplicity of your prompt, using ideas from Fooocus and elsewhere
- `[faceswap]`: Now supports the `gender_bonus` kwarg to boost the facial similarity score when source and target genders match (compatible with insightface pipeline only)
- `[faceswap]`: Now supports the `age_influence` kwarg to penalize the facial similarity score based on the age difference between source and target faces (compatible with insightface pipeline only)
- `[faceswap]`: Now supports the `prefer_gpu` kwarg to run inference on the video card if possible
- `[faceswap]`: The `make_embedding` option will now save gender and age values into the blended embedding
- `[faceswap]`: The insightface analyser is now properly cached, improving inference time significantly
- `[gpt]`: Now supports the `instruction` kwarg to help steer models that are capable of following instruction-response format prompts
- Added a customized `insightface_cuda` package that swaps hardcoded CPU references to CUDA equivalents
- Wizard UI now supports `_lines` and `_max_lines` to specify the number of rows in a textbox UI element
- Unprompted now detects if you're using the Forge WebUI
- New txt2img preset `restart_fast_v1`
- New txt2img preset `dpm_lightning_8step_v1`: Uses the new Lightning sampler and Lora in Forge WebUI for super fast SDXL inference
- New helper function `str_to_rgb()`
- Facelift template banner image
### Changed
- `[gpt]`: The default GPT-2 model is now `LykosAI/GPT-Prompt-Expansion-Fooocus-v2`
- `[gpt]`: Renamed the `cache` parg to `unload` to match the naming convention of other shortcodes
- Facelift template now defaults to the `fast_v1` preset
### Fixed
- The `wizard_generate_shortcode()` and `wizard_generate_template()` methods will no longer escape special HTML characters in the prompt
- `[after]`: Fixed compatibility issue with Forge WebUI
- `[faceswap]`: The `export_embedding` parg will now bypass the cache to avoid errors
- The `get_local_file_dir()` method now uses the `unprompted_dir` variable in case Unprompted is not in the usual `extensions` directory
### Removed
- Developer presets
</details>
<details><summary>10.6.0 - 1 December 2023</summary>
### Added
- New settings `Config.ui.wizard_shortcodes`, `Config.ui.wizard_templates`, `Config.ui.wizard_capture`: Allows you to disable certain Wizard tabs in order to improve WebUI performance

View File

@ -18,6 +18,8 @@ In the meantime, you can improve performance by disabling Wizard tabs you do not
To achieve compatibility between Unprompted and ControlNet, you must manually rename the `unprompted` extension folder to `_unprompted`. This is due to [a limitation in the Automatic1111 extension framework](https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/8011) whereby priority is determined alphabetically.
Additionally, if you're using the Forge WebUI, you should move `_unprompted` to `extensions-builtin/_unprompted` so that it can execute ahead of Forge's native ControlNet extension.
</details>
<details><summary>Compatibility with other extensions</summary>
@ -554,6 +556,8 @@ The `[set]` block supports `_show_label` which lets you toggle visibility of the
The `[set]` block supports `_info` which is descriptive text that will appear near the UI element.
The `[set]` block supports `_lines` and `_max_lines` to specify the number of rows shown in a `textbox` element.
Supports the `[wizard]` shortcode which will group the inner `[set]` blocks into a group UI element, the type of which is defined by the first parg: `accordion`, `row`, or `column`.
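For instance, a minimal sketch of an accordion group (the variable names and `_label` values are illustrative assumptions, not taken from the official templates):
```
[wizard accordion]
	[set my_prompt _label="Prompt" _info="Main subject of the image" _lines=2 _max_lines=5]a photo of a cat[/set]
	[set my_style _label="Style" _show_label=0]oil painting[/set]
[/wizard]
```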
</details>
@ -750,6 +754,22 @@ RESULT: spelling is very difficult sometimes, okay!!!
</details>
<details><summary>[autotone]</summary>
Adjusts the black point of a given image for enhanced contrast. The algorithm produces results that are virtually identical to the **Image > Auto Tone** feature in Photoshop.
Supports the `file` kwarg which is the filepath to an image to modify. Defaults to the Stable Diffusion output.
Supports the `show` parg which will append the original image to the output window.
Supports the `out` kwarg which is a location to save the modified image to.
```
[after][autotone][/after]
```
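A fuller sketch combining the documented kwargs (the filepaths are placeholders):
```
[after][autotone file="outputs/txt2img/my_image.png" show out="outputs/txt2img/my_image_autotoned.png"][/after]
```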
</details>
<details><summary>[bypass]</summary>
Allows you to disable the execution of specific shortcodes for the remainder of the run. It is similar to `[override]`, but for shortcodes instead of variables. Particularly useful for debugging purposes.
@ -1131,23 +1151,25 @@ My name is [get name]
<details><summary>[gpt]</summary>
Processes the content with a given GPT model. This is similar to the "Magic Prompts" feature of Dynamic Prompts, if you're familiar with that.
Processes the content with a given GPT-2 model. This is similar to the "Magic Prompts" feature of Dynamic Prompts, if you're familiar with that.
This shortcode requires the "transformers" package, which is included with the WebUI by default, but you may need to install the package manually if you're using Unprompted as a standalone program.
You can leave the content blank for a completely randomized prompt.
Supports the `model` kwarg which can accept a pretrained model identifier from the HuggingFace hub. Defaults to `Gustavosta/MagicPrompt-Stable-Diffusion`. The first time you use a new model, it will be downloaded to the `unprompted/models/gpt` folder.
Supports the `model` kwarg which can accept a pretrained model identifier from the HuggingFace hub. Defaults to `LykosAI/GPT-Prompt-Expansion-Fooocus-v2`. The first time you use a new model, it will be downloaded to the `unprompted/models/gpt` folder.
Please see the Wizard UI for a list of suggested models.
Supports the `task` kwarg which determines the behavior of the transformers pipeline module. Defaults to `text-generation`. You can set this to `summarization` if you want to shorten your prompts a la Midjourney.
Supports the `instruction` kwarg which is a string to be prepended to the prompt. This text will be excluded from the final result. Example: `[gpt instruction="Generate a list of animals"]cat,[/gpt]` may return `cat, dog, bird, horse, cow`.
Supports the `max_length` kwarg which is the maximum number of words to be returned by the shortcode. Defaults to 50.
Supports the `min_length` kwarg which is the minimum number of words to be returned by the shortcode. Defaults to 1.
Supports the `cache` parg to keep the model and tokenizer in memory between runs.
Supports the `unload` parg to prevent keeping the model and tokenizer in memory between runs.
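Putting these together, a hedged sketch (the instruction text and length values are illustrative; the model shown is the documented default):
```
[gpt model="LykosAI/GPT-Prompt-Expansion-Fooocus-v2" instruction="Expand this prompt with vivid detail:" max_length=40 min_length=10 unload]a castle at dusk[/gpt]
```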
</details>
@ -1689,7 +1711,7 @@ All of your kwargs are sent as URL parameters to the API (with the exception of
Supports shorthand syntax with pargs, where the first parg is `types` (e.g. LORA or TextualInversion), the second parg is `query` (model name search terms), the third parg is `_weight` (optional, defaults to 1.0), and the fourth parg (also optional) is the `_file`. For example: `[civitai lora EasyNegative 0.5]`.
The `query` value is used as the filename to look for on your filesystem. You can typically search Civitai for a direct model filename (e.g. `query="kkw-new-neg-v1.4"` will return the 'New Negative' model). However, if this isn't working for whatever reason, you can override the filesystem search with the `_file` kwarg: `[civitai query="New Negative" _file="kkw-new-neg-v1.4"]` - but consider this a last resort!
The `query` value is used as the filename to look for on your filesystem. You can typically search Civitai for a direct model filename (e.g. `query="kkw-new-neg-v1.4"` will return the 'New Negative' model). However, if this isn't working for whatever reason, you can override the filesystem search with the `_file` kwarg: `[civitai query="New Negative" _file="kkw-new-neg-v1.4"]`.
This shortcode will auto-correct the case-sensitivity of `types` to the API's expected format. The API is a bit inconsistent in this regard (e.g. lora = `LORA`, controlnet = `Controlnet`, aestheticgradient = `AestheticGradient`...) but Unprompted will handle it for you. Here are the other edge cases that Unprompted will catch:
@ -1734,11 +1756,15 @@ The `insightface` pipeline is currently the most developed option as it supports
- It supports the `minimum_similarity` kwarg to bypass the faceswap if no one in the target picture bears resemblance to the new face. This kwarg takes a float value, although I haven't determined the upper and lower boundaries yet. A greater value means "more similar" and the range appears to be something like -10 to 300.
- It supports the `export_embedding` parg which takes the average of all input faces and exports it to a safetensors embedding file. This file represents a composite face that can be used in lieu of individual images.
- It supports the `embedding_path` kwarg which is the path to use in conjunction with `export_embedding`. Defaults to `unprompted/user/faces/blended_faces.safetensors`.
- It supports the `gender_bonus` kwarg to boost the facial similarity score when source and target genders match.
- It supports the `age_influence` kwarg to penalize the facial similarity score based on the age difference between source and target faces.
Supports the `visibility` kwarg which is the alpha value with which to blend the result back into the original image. Defaults to 1.0.
Supports the `unload` kwarg which allows you to free some or all of the faceswap components after inference. Useful for low memory devices, but will increase inference time. You can pass the following as a delimited string with `Config.syntax.delimiter`: `model`, `face`, `all`.
Supports the `prefer_gpu` kwarg to run on the video card whenever possible.
It is recommended to follow this shortcode with `[restore_faces]` in order to improve the resolution of the swapped result. Or, use the included Facelift template as an all-in-one solution.
Additional pipelines may be supported in the future. Attempts were made to implement support for SimSwap; however, this proved challenging due to multiple dependency conflicts.
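To illustrate, a hedged sketch that combines the new kwargs (the filepath and numeric values are placeholders; sensible ranges for `gender_bonus` and `age_influence` are not documented):
```
[faceswap gender_bonus=50 age_influence=0.5 prefer_gpu=1 minimum_similarity=100 visibility=0.9 unload="model"]user/faces/my_face.jpg[/faceswap][restore_faces]
```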

Binary file not shown.

After

Width:  |  Height:  |  Size: 988 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 666 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 29 KiB

View File

@ -9,6 +9,7 @@ pil_resampling_dict["Hamming"] = 5
pil_resampling_dict["Bicubic"] = 3
pil_resampling_dict["Lanczos"] = 1
def strip_str(string, chop):
"""Removes substring `chop` from the beginning or end of given `string`"""
while True:
@ -23,10 +24,12 @@ def strip_str(string, chop):
break
return string
def sigmoid(x):
import math
return 1 / (1 + math.exp(-x))
def is_equal(var_a, var_b):
"""Checks if two variables equal each other, taking care to account for datatypes."""
if (is_float(var_a)): var_a = float(var_a)
@ -57,13 +60,15 @@ def is_int(value):
except:
return False
def ensure(var,datatype):
def ensure(var, datatype):
"""Ensures that a variable is a given datatype"""
if isinstance(var, datatype): return var
else:
if datatype == list: return [var]
return datatype(var)
def autocast(var):
"""Converts a variable between string, int, and float depending on how it's formatted"""
original_var = var
@ -74,23 +79,36 @@ def autocast(var):
elif (is_int(var)): var = int(var)
return (var)
def pil_to_cv2(img):
import cv2, numpy
return cv2.cvtColor(numpy.array(img), cv2.COLOR_RGB2BGR)
def cv2_to_pil(img):
import cv2
from PIL import Image
return Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
def str_to_rgb(color_string):
"""Converts a color string to a tuple of RGB values"""
if color_string[0].isdigit():
return tuple(map(int, color_string.split(',')))
elif color_string.startswith("#"):
return tuple(bytes.fromhex(color_string[1:]))
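# Illustrative sketch of the expected results (both input forms yield a tuple of ints):
# str_to_rgb("255,128,0") -> (255, 128, 0)
# str_to_rgb("#ff8000") -> (255, 128, 0)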
def get_logger(logger=None):
if not logger:
try:
import logging
logger = logging.getLogger("Unprompted").info
except: logger = print
except:
logger = print
return logger
def download_file(filename, url, logger=None, overwrite=False, headers=None):
import os, requests
@ -101,7 +119,7 @@ def download_file(filename, url, logger=None, overwrite=False, headers=None):
os.makedirs(os.path.dirname(os.path.abspath(filename)), exist_ok=True)
log(f"Downloading file into: {filename}...")
response = requests.get(url, stream=True,headers=headers)
response = requests.get(url, stream=True, headers=headers)
if response.status_code != 200:
log(f"Problematic status code received: {response.status_code}")
return False
@ -112,6 +130,7 @@ def download_file(filename, url, logger=None, overwrite=False, headers=None):
fout.write(block)
return True
def import_file(full_name, path):
"""Allows importing of modules from full filepath, not sure why Python requires a helper function for this in 2023"""
from importlib import util
@ -129,6 +148,7 @@ def list_set(this_list, index, value, null_value=False):
this_list.append(null_value)
this_list[index] = value
def str_with_ext(path, default_ext=".json"):
import os
if os.path.exists(path) or default_ext in path:
@ -150,6 +170,7 @@ def create_load_json(file_path, default_data={}, encoding="utf8"):
return data
def unsharp_mask(image, amount=1.0, kernel_size=(5, 5), sigma=1.0, threshold=0):
"""Return a sharpened version of the image, using an unsharp mask."""
import numpy, cv2
@ -165,10 +186,11 @@ def unsharp_mask(image, amount=1.0, kernel_size=(5, 5), sigma=1.0, threshold=0):
numpy.copyto(sharpened, image, where=low_contrast_mask)
return Image.fromarray(sharpened)
# Helper class that converts kwargs to attribute notation
# Many libraries expect to be fed options with argparse,
# which is not so straightforward inside of an A1111 extension
class AttrDict(dict):
def __init__(self, *args, **kwargs):
super(AttrDict, self).__init__(*args, **kwargs)
self.__dict__ = self
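# Usage sketch: AttrDict exposes dict keys as attributes,
# e.g. opts = AttrDict(prefer_gpu=True); opts.prefer_gpu == opts["prefer_gpu"]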

View File

@ -0,0 +1,21 @@
# coding: utf-8
# pylint: disable=wrong-import-position
"""InsightFace: A Face Analysis Toolkit."""
from __future__ import absolute_import
try:
#import mxnet as mx
import onnxruntime
except ImportError:
raise ImportError(
"Unable to import dependency onnxruntime. "
)
__version__ = '0.7.3'
from . import model_zoo
from . import utils
from . import app
from . import data
from . import thirdparty

View File

@ -0,0 +1,2 @@
from .face_analysis import *
from .mask_renderer import *

View File

@ -0,0 +1,49 @@
import numpy as np
from numpy.linalg import norm as l2norm
#from easydict import EasyDict
class Face(dict):
def __init__(self, d=None, **kwargs):
if d is None:
d = {}
if kwargs:
d.update(**kwargs)
for k, v in d.items():
setattr(self, k, v)
# Class attributes
#for k in self.__class__.__dict__.keys():
# if not (k.startswith('__') and k.endswith('__')) and not k in ('update', 'pop'):
# setattr(self, k, getattr(self, k))
def __setattr__(self, name, value):
if isinstance(value, (list, tuple)):
value = [self.__class__(x)
if isinstance(x, dict) else x for x in value]
elif isinstance(value, dict) and not isinstance(value, self.__class__):
value = self.__class__(value)
super(Face, self).__setattr__(name, value)
super(Face, self).__setitem__(name, value)
__setitem__ = __setattr__
def __getattr__(self, name):
return None
@property
def embedding_norm(self):
if self.embedding is None:
return None
return l2norm(self.embedding)
@property
def normed_embedding(self):
if self.embedding is None:
return None
return self.embedding / self.embedding_norm
@property
def sex(self):
if self.gender is None:
return None
return 'M' if self.gender==1 else 'F'

View File

@ -0,0 +1,109 @@
# -*- coding: utf-8 -*-
# @Organization : insightface.ai
# @Author : Jia Guo
# @Time : 2021-05-04
# @Function :
from __future__ import division
import glob
import os.path as osp
import numpy as np
import onnxruntime
from numpy.linalg import norm
from ..model_zoo import model_zoo
from ..utils import DEFAULT_MP_NAME, ensure_available
from .common import Face
__all__ = ['FaceAnalysis']
class FaceAnalysis:
def __init__(self, name=DEFAULT_MP_NAME, root='~/.insightface', allowed_modules=None, **kwargs):
onnxruntime.set_default_logger_severity(3)
self.models = {}
self.model_dir = ensure_available('models', name, root=root)
onnx_files = glob.glob(osp.join(self.model_dir, '*.onnx'))
onnx_files = sorted(onnx_files)
for onnx_file in onnx_files:
model = model_zoo.get_model(onnx_file, **kwargs)
if model is None:
print('model not recognized:', onnx_file)
elif allowed_modules is not None and model.taskname not in allowed_modules:
print('model ignore:', onnx_file, model.taskname)
del model
elif model.taskname not in self.models and (allowed_modules is None or model.taskname in allowed_modules):
print('find model:', onnx_file, model.taskname, model.input_shape, model.input_mean, model.input_std)
self.models[model.taskname] = model
else:
print('duplicated model task type, ignore:', onnx_file, model.taskname)
del model
assert 'detection' in self.models
self.det_model = self.models['detection']
def prepare(self, ctx_id, det_thresh=0.5, det_size=(640, 640)):
self.det_thresh = det_thresh
assert det_size is not None
print('set det-size:', det_size)
self.det_size = det_size
for taskname, model in self.models.items():
if taskname=='detection':
model.prepare(ctx_id, input_size=det_size, det_thresh=det_thresh)
else:
model.prepare(ctx_id)
def get(self, img, max_num=0):
bboxes, kpss = self.det_model.detect(img,
max_num=max_num,
metric='default')
if bboxes.shape[0] == 0:
return []
ret = []
for i in range(bboxes.shape[0]):
bbox = bboxes[i, 0:4]
det_score = bboxes[i, 4]
kps = None
if kpss is not None:
kps = kpss[i]
face = Face(bbox=bbox, kps=kps, det_score=det_score)
for taskname, model in self.models.items():
if taskname=='detection':
continue
model.get(img, face)
ret.append(face)
return ret
def draw_on(self, img, faces):
import cv2
dimg = img.copy()
for i in range(len(faces)):
face = faces[i]
box = face.bbox.astype(int)
color = (0, 0, 255)
cv2.rectangle(dimg, (box[0], box[1]), (box[2], box[3]), color, 2)
if face.kps is not None:
kps = face.kps.astype(int)
#print(landmark.shape)
for l in range(kps.shape[0]):
color = (0, 0, 255)
if l == 0 or l == 3:
color = (0, 255, 0)
cv2.circle(dimg, (kps[l][0], kps[l][1]), 1, color,
2)
if face.gender is not None and face.age is not None:
cv2.putText(dimg,'%s,%d'%(face.sex,face.age), (box[0]-1, box[1]-4),cv2.FONT_HERSHEY_COMPLEX,0.7,(0,255,0),1)
#for key, value in face.items():
# if key.startswith('landmark_3d'):
# print(key, value.shape)
# print(value[0:10,:])
# lmk = np.round(value).astype(np.int)
# for l in range(lmk.shape[0]):
# color = (255, 0, 0)
# cv2.circle(dimg, (lmk[l][0], lmk[l][1]), 1, color,
# 2)
return dimg
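# Usage sketch (assumes the default 'buffalo_l' model pack is present under ~/.insightface):
# app = FaceAnalysis(name='buffalo_l')
# app.prepare(ctx_id=0, det_size=(640, 640))
# faces = app.get(cv2.imread('photo.jpg'))  # list of Face objects with bbox, kps, embedding, gender, age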

View File

@ -0,0 +1,232 @@
import os, sys, datetime
import numpy as np
import os.path as osp
import albumentations as A
from albumentations.core.transforms_interface import ImageOnlyTransform
from .face_analysis import FaceAnalysis
from ..utils import get_model_dir
from ..thirdparty import face3d
from ..data import get_image as ins_get_image
from ..utils import DEFAULT_MP_NAME
import cv2
class MaskRenderer:
def __init__(self, name=DEFAULT_MP_NAME, root='~/.insightface', insfa=None):
#if insfa is None, enter render_only mode
self.mp_name = name
self.root = root
self.insfa = insfa
model_dir = get_model_dir(name, root)
bfm_file = osp.join(model_dir, 'BFM.mat')
assert osp.exists(bfm_file), 'should contain BFM.mat in your model directory'
self.bfm = face3d.morphable_model.MorphabelModel(bfm_file)
self.index_ind = self.bfm.kpt_ind
bfm_uv_file = osp.join(model_dir, 'BFM_UV.mat')
assert osp.exists(bfm_uv_file), 'should contain BFM_UV.mat in your model directory'
uv_coords = face3d.morphable_model.load.load_uv_coords(bfm_uv_file)
self.uv_size = (224,224)
self.mask_stxr = 0.1
self.mask_styr = 0.33
self.mask_etxr = 0.9
self.mask_etyr = 0.7
self.tex_h , self.tex_w, self.tex_c = self.uv_size[1] , self.uv_size[0],3
texcoord = np.zeros_like(uv_coords)
texcoord[:, 0] = uv_coords[:, 0] * (self.tex_h - 1)
texcoord[:, 1] = uv_coords[:, 1] * (self.tex_w - 1)
texcoord[:, 1] = self.tex_w - texcoord[:, 1] - 1
self.texcoord = np.hstack((texcoord, np.zeros((texcoord.shape[0], 1))))
self.X_ind = self.bfm.kpt_ind
self.mask_image_names = ['mask_white', 'mask_blue', 'mask_black', 'mask_green']
self.mask_aug_probs = [0.4, 0.4, 0.1, 0.1]
#self.mask_images = []
#self.mask_images_rgb = []
#for image_name in mask_image_names:
# mask_image = ins_get_image(image_name)
# self.mask_images.append(mask_image)
# mask_image_rgb = mask_image[:,:,::-1]
# self.mask_images_rgb.append(mask_image_rgb)
def prepare(self, ctx_id=0, det_thresh=0.5, det_size=(128, 128)):
self.pre_ctx_id = ctx_id
self.pre_det_thresh = det_thresh
self.pre_det_size = det_size
def transform(self, shape3D, R):
s = 1.0
shape3D[:2, :] = shape3D[:2, :]
shape3D = s * np.dot(R, shape3D)
return shape3D
def preprocess(self, vertices, w, h):
R1 = face3d.mesh.transform.angle2matrix([0, 180, 180])
t = np.array([-w // 2, -h // 2, 0])
vertices = vertices.T
vertices += t
vertices = self.transform(vertices.T, R1).T
return vertices
def project_to_2d(self,vertices,s,angles,t):
transformed_vertices = self.bfm.transform(vertices, s, angles, t)
projected_vertices = transformed_vertices.copy() # using standard camera & orthographic projection
return projected_vertices[self.bfm.kpt_ind, :2]
def params_to_vertices(self,params , H , W):
fitted_sp, fitted_ep, fitted_s, fitted_angles, fitted_t = params
fitted_vertices = self.bfm.generate_vertices(fitted_sp, fitted_ep)
transformed_vertices = self.bfm.transform(fitted_vertices, fitted_s, fitted_angles,
fitted_t)
transformed_vertices = self.preprocess(transformed_vertices.T, W, H)
image_vertices = face3d.mesh.transform.to_image(transformed_vertices, H, W)
return image_vertices
def draw_lmk(self, face_image):
faces = self.insfa.get(face_image, max_num=1)
if len(faces)==0:
return face_image
return self.insfa.draw_on(face_image, faces)
def build_params(self, face_image):
#landmark = self.if3d68_handler.get(face_image)
#if landmark is None:
# return None #face not found
if self.insfa is None:
self.insfa = FaceAnalysis(name=self.mp_name, root=self.root, allowed_modules=['detection', 'landmark_3d_68'])
self.insfa.prepare(ctx_id=self.pre_ctx_id, det_thresh=self.pre_det_thresh, det_size=self.pre_det_size)
faces = self.insfa.get(face_image, max_num=1)
if len(faces)==0:
return None
landmark = faces[0].landmark_3d_68[:,:2]
fitted_sp, fitted_ep, fitted_s, fitted_angles, fitted_t = self.bfm.fit(landmark, self.X_ind, max_iter = 3)
return [fitted_sp, fitted_ep, fitted_s, fitted_angles, fitted_t]
def generate_mask_uv(self,mask, positions):
uv_size = (self.uv_size[1], self.uv_size[0], 3)
h, w, c = uv_size
uv = np.zeros(shape=(self.uv_size[1],self.uv_size[0], 3), dtype=np.uint8)
stxr, styr = positions[0], positions[1]
etxr, etyr = positions[2], positions[3]
stx, sty = int(w * stxr), int(h * styr)
etx, ety = int(w * etxr), int(h * etyr)
height = ety - sty
width = etx - stx
mask = cv2.resize(mask, (width, height))
uv[sty:ety, stx:etx] = mask
return uv
def render_mask(self,face_image, mask_image, params, input_is_rgb=False, auto_blend = True, positions=[0.1, 0.33, 0.9, 0.7]):
if isinstance(mask_image, str):
to_rgb = True if input_is_rgb else False
mask_image = ins_get_image(mask_image, to_rgb=to_rgb)
uv_mask_image = self.generate_mask_uv(mask_image, positions)
h,w,c = face_image.shape
image_vertices = self.params_to_vertices(params ,h,w)
output = (1-face3d.mesh.render.render_texture(image_vertices, self.bfm.full_triangles , uv_mask_image, self.texcoord, self.bfm.full_triangles, h , w ))*255
output = output.astype(np.uint8)
if auto_blend:
mask_bd = (output==255).astype(np.uint8)
final = face_image*mask_bd + (1-mask_bd)*output
return final
return output
#def mask_augmentation(self, face_image, label, input_is_rgb=False, p=0.1):
# if np.random.random()<p:
# assert isinstance(label, (list, np.ndarray)), 'make sure the rec dataset includes mask params'
# assert len(label)==237 or len(lable)==235, 'make sure the rec dataset includes mask params'
# if len(label)==237:
# if label[1]<0.0: #invalid label for mask aug
# return face_image
# label = label[2:]
# params = self.decode_params(label)
# mask_image_name = np.random.choice(self.mask_image_names, p=self.mask_aug_probs)
# pos = np.random.uniform(0.33, 0.5)
# face_image = self.render_mask(face_image, mask_image_name, params, input_is_rgb=input_is_rgb, positions=[0.1, pos, 0.9, 0.7])
# return face_image
@staticmethod
def encode_params(params):
p0 = list(params[0])
p1 = list(params[1])
p2 = [float(params[2])]
p3 = list(params[3])
p4 = list(params[4])
return p0+p1+p2+p3+p4
@staticmethod
def decode_params(params):
p0 = params[0:199]
p0 = np.array(p0, dtype=np.float32).reshape( (-1, 1))
p1 = params[199:228]
p1 = np.array(p1, dtype=np.float32).reshape( (-1, 1))
p2 = params[228]
p3 = tuple(params[229:232])
p4 = params[232:235]
p4 = np.array(p4, dtype=np.float32).reshape( (-1, 1))
return p0, p1, p2, p3, p4
class MaskAugmentation(ImageOnlyTransform):
def __init__(
self,
mask_names=['mask_white', 'mask_blue', 'mask_black', 'mask_green'],
mask_probs=[0.4,0.4,0.1,0.1],
h_low = 0.33,
h_high = 0.35,
always_apply=False,
p=1.0,
):
super(MaskAugmentation, self).__init__(always_apply, p)
self.renderer = MaskRenderer()
assert len(mask_names)>0
assert len(mask_names)==len(mask_probs)
self.mask_names = mask_names
self.mask_probs = mask_probs
self.h_low = h_low
self.h_high = h_high
#self.hlabel = None
def apply(self, image, hlabel, mask_name, h_pos, **params):
#print(params.keys())
#hlabel = params.get('hlabel')
assert len(hlabel)==237 or len(hlabel)==235, 'make sure the rec dataset includes mask params'
if len(hlabel)==237:
if hlabel[1]<0.0:
return image
hlabel = hlabel[2:]
#print(len(hlabel))
mask_params = self.renderer.decode_params(hlabel)
image = self.renderer.render_mask(image, mask_name, mask_params, input_is_rgb=True, positions=[0.1, h_pos, 0.9, 0.7])
return image
@property
def targets_as_params(self):
return ["image", "hlabel"]
def get_params_dependent_on_targets(self, params):
hlabel = params['hlabel']
mask_name = np.random.choice(self.mask_names, p=self.mask_probs)
h_pos = np.random.uniform(self.h_low, self.h_high)
return {'hlabel': hlabel, 'mask_name': mask_name, 'h_pos': h_pos}
def get_transform_init_args_names(self):
#return ("hlabel", 'mask_names', 'mask_probs', 'h_low', 'h_high')
return ('mask_names', 'mask_probs', 'h_low', 'h_high')
if __name__ == "__main__":
tool = MaskRenderer('antelope')
tool.prepare(det_size=(128,128))
image = cv2.imread("Tom_Hanks_54745.png")
params = tool.build_params(image)
#out = tool.draw_lmk(image)
#cv2.imwrite('output_lmk.jpg', out)
#mask_image = cv2.imread("masks/mask1.jpg")
#mask_image = cv2.imread("masks/black-mask.png")
#mask_image = cv2.imread("masks/mask2.jpg")
mask_out = tool.render_mask(image, 'mask_blue', params)# use single thread to test the time cost
cv2.imwrite('output_mask.jpg', mask_out)

View File

@ -0,0 +1,13 @@
from abc import ABC, abstractmethod
from argparse import ArgumentParser
class BaseInsightFaceCLICommand(ABC):
@staticmethod
@abstractmethod
def register_subcommand(parser: ArgumentParser):
raise NotImplementedError()
@abstractmethod
def run(self):
raise NotImplementedError()

View File

@ -0,0 +1,29 @@
#!/usr/bin/env python
from argparse import ArgumentParser
from .model_download import ModelDownloadCommand
from .rec_add_mask_param import RecAddMaskParamCommand
def main():
parser = ArgumentParser("InsightFace CLI tool", usage="insightface-cli <command> [<args>]")
commands_parser = parser.add_subparsers(help="insightface-cli command-line helpers")
# Register commands
ModelDownloadCommand.register_subcommand(commands_parser)
RecAddMaskParamCommand.register_subcommand(commands_parser)
args = parser.parse_args()
if not hasattr(args, "func"):
parser.print_help()
exit(1)
# Run
service = args.func(args)
service.run()
if __name__ == "__main__":
main()

View File

@ -0,0 +1,36 @@
from argparse import ArgumentParser
from . import BaseInsightFaceCLICommand
import os
import os.path as osp
import zipfile
import glob
from ..utils import download
def model_download_command_factory(args):
return ModelDownloadCommand(args.model, args.root, args.force)
class ModelDownloadCommand(BaseInsightFaceCLICommand):
#_url_format = '{repo_url}models/{file_name}.zip'
@staticmethod
def register_subcommand(parser: ArgumentParser):
download_parser = parser.add_parser("model.download")
download_parser.add_argument(
"--root", type=str, default='~/.insightface', help="Path to location to store the models"
)
download_parser.add_argument(
"--force", action="store_true", help="Force the model to be download even if already in root-dir"
)
download_parser.add_argument("model", type=str, help="Name of the model to download")
download_parser.set_defaults(func=model_download_command_factory)
def __init__(self, model: str, root: str, force: bool):
self._model = model
self._root = root
self._force = force
def run(self):
download('models', self._model, force=self._force, root=self._root)

View File

@ -0,0 +1,94 @@
import numbers
import os
from argparse import ArgumentParser, Namespace
import mxnet as mx
import numpy as np
from ..app import MaskRenderer
from ..data.rec_builder import RecBuilder
from . import BaseInsightFaceCLICommand
def rec_add_mask_param_command_factory(args: Namespace):
return RecAddMaskParamCommand(
args.input, args.output
)
class RecAddMaskParamCommand(BaseInsightFaceCLICommand):
@staticmethod
def register_subcommand(parser: ArgumentParser):
_parser = parser.add_parser("rec.addmaskparam")
_parser.add_argument("input", type=str, help="input rec")
_parser.add_argument("output", type=str, help="output rec, with mask param")
_parser.set_defaults(func=rec_add_mask_param_command_factory)
def __init__(
self,
input: str,
output: str,
):
self._input = input
self._output = output
def run(self):
tool = MaskRenderer()
tool.prepare(ctx_id=0, det_size=(128,128))
root_dir = self._input
path_imgrec = os.path.join(root_dir, 'train.rec')
path_imgidx = os.path.join(root_dir, 'train.idx')
imgrec = mx.recordio.MXIndexedRecordIO(path_imgidx, path_imgrec, 'r')
save_path = self._output
wrec=RecBuilder(path=save_path)
s = imgrec.read_idx(0)
header, _ = mx.recordio.unpack(s)
if header.flag > 0:
if len(header.label)==2:
imgidx = np.array(range(1, int(header.label[0])))
else:
imgidx = np.array(list(imgrec.keys))
else:
imgidx = np.array(list(imgrec.keys))
stat = [0, 0]
print('total:', len(imgidx))
for iid, idx in enumerate(imgidx):
#if iid==500000:
# break
if iid%1000==0:
print('processing:', iid)
s = imgrec.read_idx(idx)
header, img = mx.recordio.unpack(s)
label = header.label
if not isinstance(label, numbers.Number):
label = label[0]
sample = mx.image.imdecode(img).asnumpy()
bgr = sample[:,:,::-1]
params = tool.build_params(bgr)
#if iid<10:
# mask_out = tool.render_mask(bgr, 'mask_blue', params)
# cv2.imwrite('maskout_%d.jpg'%iid, mask_out)
stat[1] += 1
if params is None:
wlabel = [label] + [-1.0]*236
stat[0] += 1
else:
#print(0, params[0].shape, params[0].dtype)
#print(1, params[1].shape, params[1].dtype)
#print(2, params[2])
#print(3, len(params[3]), params[3][0].__class__)
#print(4, params[4].shape, params[4].dtype)
mask_label = tool.encode_params(params)
wlabel = [label, 0.0]+mask_label # 237 including idlabel, total mask params size is 235
if iid==0:
print('param size:', len(mask_label), len(wlabel), label)
assert len(wlabel)==237
wrec.add_image(img, wlabel)
#print(len(params))
wrec.close()
print('finished on', self._output, ', failed:', stat[0])

View File

@ -0,0 +1,2 @@
from .image import get_image
from .pickle_object import get_object

View File

@ -0,0 +1,27 @@
import cv2
import os
import os.path as osp
from pathlib import Path
class ImageCache:
data = {}
def get_image(name, to_rgb=False):
key = (name, to_rgb)
if key in ImageCache.data:
return ImageCache.data[key]
images_dir = osp.join(Path(__file__).parent.absolute(), 'images')
ext_names = ['.jpg', '.png', '.jpeg']
image_file = None
for ext_name in ext_names:
_image_file = osp.join(images_dir, "%s%s"%(name, ext_name))
if osp.exists(_image_file):
image_file = _image_file
break
assert image_file is not None, '%s not found'%name
img = cv2.imread(image_file)
if to_rgb:
img = img[:,:,::-1]
ImageCache.data[key] = img
return img

Binary file not shown.

After

Width:  |  Height:  |  Size: 12 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 21 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 44 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 6.0 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 77 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 126 KiB

View File

@ -0,0 +1,17 @@
import cv2
import os
import os.path as osp
from pathlib import Path
import pickle
def get_object(name):
objects_dir = osp.join(Path(__file__).parent.absolute(), 'objects')
if not name.endswith('.pkl'):
name = name+".pkl"
filepath = osp.join(objects_dir, name)
if not osp.exists(filepath):
return None
with open(filepath, 'rb') as f:
obj = pickle.load(f)
return obj

View File

@ -0,0 +1,71 @@
import pickle
import numpy as np
import os
import os.path as osp
import sys
import mxnet as mx
class RecBuilder():
def __init__(self, path, image_size=(112, 112)):
self.path = path
self.image_size = image_size
self.widx = 0
self.wlabel = 0
self.max_label = -1
assert not osp.exists(path), '%s exists' % path
os.makedirs(path)
self.writer = mx.recordio.MXIndexedRecordIO(os.path.join(path, 'train.idx'),
os.path.join(path, 'train.rec'),
'w')
self.meta = []
def add(self, imgs):
#!!! img should be BGR!!!!
#assert label >= 0
#assert label > self.last_label
assert len(imgs) > 0
label = self.wlabel
for img in imgs:
idx = self.widx
image_meta = {'image_index': idx, 'image_classes': [label]}
header = mx.recordio.IRHeader(0, label, idx, 0)
if isinstance(img, np.ndarray):
s = mx.recordio.pack_img(header,img,quality=95,img_fmt='.jpg')
else:
s = mx.recordio.pack(header, img)
self.writer.write_idx(idx, s)
self.meta.append(image_meta)
self.widx += 1
self.max_label = label
self.wlabel += 1
def add_image(self, img, label):
#!!! img should be BGR!!!!
#assert label >= 0
#assert label > self.last_label
idx = self.widx
header = mx.recordio.IRHeader(0, label, idx, 0)
if isinstance(label, list):
idlabel = label[0]
else:
idlabel = label
image_meta = {'image_index': idx, 'image_classes': [idlabel]}
if isinstance(img, np.ndarray):
s = mx.recordio.pack_img(header,img,quality=95,img_fmt='.jpg')
else:
s = mx.recordio.pack(header, img)
self.writer.write_idx(idx, s)
self.meta.append(image_meta)
self.widx += 1
self.max_label = max(self.max_label, idlabel)
def close(self):
with open(osp.join(self.path, 'train.meta'), 'wb') as pfile:
pickle.dump(self.meta, pfile, protocol=pickle.HIGHEST_PROTOCOL)
print('stat:', self.widx, self.wlabel)
with open(os.path.join(self.path, 'property'), 'w') as f:
f.write("%d,%d,%d\n" % (self.max_label+1, self.image_size[0], self.image_size[1]))
f.write("%d\n" % (self.widx))

View File

@ -0,0 +1,6 @@
from .model_zoo import get_model
from .arcface_onnx import ArcFaceONNX
from .retinaface import RetinaFace
from .scrfd import SCRFD
from .landmark import Landmark
from .attribute import Attribute

View File

@ -0,0 +1,89 @@
# -*- coding: utf-8 -*-
# @Organization : insightface.ai
# @Author : Jia Guo
# @Time : 2021-05-04
# @Function :
from __future__ import division
import numpy as np
import cv2
import onnx
import onnxruntime
from ..utils import face_align
__all__ = [
'ArcFaceONNX',
]
class ArcFaceONNX:
def __init__(self, model_file=None, session=None):
assert model_file is not None
self.model_file = model_file
self.session = session
self.taskname = 'recognition'
find_sub = False
find_mul = False
model = onnx.load(self.model_file)
graph = model.graph
for nid, node in enumerate(graph.node[:8]):
#print(nid, node.name)
if node.name.startswith('Sub') or node.name.startswith('_minus'):
find_sub = True
if node.name.startswith('Mul') or node.name.startswith('_mul'):
find_mul = True
if find_sub and find_mul:
#mxnet arcface model
input_mean = 0.0
input_std = 1.0
else:
input_mean = 127.5
input_std = 127.5
self.input_mean = input_mean
self.input_std = input_std
#print('input mean and std:', self.input_mean, self.input_std)
if self.session is None:
self.session = onnxruntime.InferenceSession(self.model_file, None)
input_cfg = self.session.get_inputs()[0]
input_shape = input_cfg.shape
input_name = input_cfg.name
self.input_size = tuple(input_shape[2:4][::-1])
self.input_shape = input_shape
outputs = self.session.get_outputs()
output_names = []
for out in outputs:
output_names.append(out.name)
self.input_name = input_name
self.output_names = output_names
assert len(self.output_names) == 1
self.output_shape = outputs[0].shape
def prepare(self, ctx_id, **kwargs):
if ctx_id < 0:
self.session.set_providers(['CUDAExecutionProvider'])
def get(self, img, face):
aimg = face_align.norm_crop(img, landmark=face.kps, image_size=self.input_size[0])
face.embedding = self.get_feat(aimg).flatten()
return face.embedding
def compute_sim(self, feat1, feat2):
from numpy.linalg import norm
feat1 = feat1.ravel()
feat2 = feat2.ravel()
sim = np.dot(feat1, feat2) / (norm(feat1) * norm(feat2))
return sim
def get_feat(self, imgs):
if not isinstance(imgs, list):
imgs = [imgs]
input_size = self.input_size
blob = cv2.dnn.blobFromImages(imgs, 1.0 / self.input_std, input_size, (self.input_mean, self.input_mean, self.input_mean), swapRB=True)
net_out = self.session.run(self.output_names, {self.input_name: blob})[0]
return net_out
def forward(self, batch_data):
blob = (batch_data - self.input_mean) / self.input_std
net_out = self.session.run(self.output_names, {self.input_name: blob})[0]
return net_out
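# Usage sketch (the model filename is illustrative): cosine similarity of two face embeddings,
# rec = ArcFaceONNX('models/w600k_r50.onnx')
# sim = rec.compute_sim(rec.get(img1, face1), rec.get(img2, face2))  # roughly -1..1; higher means more similar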

View File

@ -0,0 +1,92 @@
# -*- coding: utf-8 -*-
# @Organization : insightface.ai
# @Author : Jia Guo
# @Time : 2021-06-19
# @Function :
from __future__ import division
import numpy as np
import cv2
import onnx
import onnxruntime
from ..utils import face_align
__all__ = [
'Attribute',
]
class Attribute:
def __init__(self, model_file=None, session=None):
assert model_file is not None
self.model_file = model_file
self.session = session
find_sub = False
find_mul = False
model = onnx.load(self.model_file)
graph = model.graph
for nid, node in enumerate(graph.node[:8]):
#print(nid, node.name)
if node.name.startswith('Sub') or node.name.startswith('_minus'):
find_sub = True
if node.name.startswith('Mul') or node.name.startswith('_mul'):
find_mul = True
if nid < 3 and node.name == 'bn_data':
find_sub = True
find_mul = True
if find_sub and find_mul:
#mxnet arcface model
input_mean = 0.0
input_std = 1.0
else:
input_mean = 127.5
input_std = 128.0
self.input_mean = input_mean
self.input_std = input_std
#print('input mean and std:', model_file, self.input_mean, self.input_std)
if self.session is None:
self.session = onnxruntime.InferenceSession(self.model_file, None)
input_cfg = self.session.get_inputs()[0]
input_shape = input_cfg.shape
input_name = input_cfg.name
self.input_size = tuple(input_shape[2:4][::-1])
self.input_shape = input_shape
outputs = self.session.get_outputs()
output_names = []
for out in outputs:
output_names.append(out.name)
self.input_name = input_name
self.output_names = output_names
assert len(self.output_names) == 1
output_shape = outputs[0].shape
#print('init output_shape:', output_shape)
if output_shape[1] == 3:
self.taskname = 'genderage'
else:
self.taskname = 'attribute_%d' % output_shape[1]
def prepare(self, ctx_id, **kwargs):
if ctx_id < 0:
self.session.set_providers(['CUDAExecutionProvider'])
def get(self, img, face):
bbox = face.bbox
w, h = (bbox[2] - bbox[0]), (bbox[3] - bbox[1])
center = (bbox[2] + bbox[0]) / 2, (bbox[3] + bbox[1]) / 2
rotate = 0
_scale = self.input_size[0] / (max(w, h) * 1.5)
#print('param:', img.shape, bbox, center, self.input_size, _scale, rotate)
aimg, M = face_align.transform(img, center, self.input_size[0], _scale, rotate)
input_size = tuple(aimg.shape[0:2][::-1])
#assert input_size==self.input_size
blob = cv2.dnn.blobFromImage(aimg, 1.0 / self.input_std, input_size, (self.input_mean, self.input_mean, self.input_mean), swapRB=True)
pred = self.session.run(self.output_names, {self.input_name: blob})[0][0]
if self.taskname == 'genderage':
assert len(pred) == 3
gender = np.argmax(pred[:2])
age = int(np.round(pred[2] * 100))
face['gender'] = gender
face['age'] = age
return gender, age
else:
return pred
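# Usage sketch: for a 'genderage' model, get() annotates the Face object in place,
# gender, age = attr_model.get(img, face)  # also sets face['gender'] (1 = male) and face['age']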

View File

@ -0,0 +1,105 @@
import time
import numpy as np
import onnxruntime
import cv2
import onnx
from onnx import numpy_helper
from ..utils import face_align
class INSwapper():
def __init__(self, model_file=None, session=None):
self.model_file = model_file
self.session = session
model = onnx.load(self.model_file)
graph = model.graph
self.emap = numpy_helper.to_array(graph.initializer[-1])
self.input_mean = 0.0
self.input_std = 255.0
#print('input mean and std:', model_file, self.input_mean, self.input_std)
if self.session is None:
self.session = onnxruntime.InferenceSession(self.model_file, None)
inputs = self.session.get_inputs()
self.input_names = []
for inp in inputs:
self.input_names.append(inp.name)
outputs = self.session.get_outputs()
output_names = []
for out in outputs:
output_names.append(out.name)
self.output_names = output_names
assert len(self.output_names)==1
output_shape = outputs[0].shape
input_cfg = inputs[0]
input_shape = input_cfg.shape
self.input_shape = input_shape
print('inswapper-shape:', self.input_shape)
self.input_size = tuple(input_shape[2:4][::-1])
def forward(self, img, latent):
img = (img - self.input_mean) / self.input_std
pred = self.session.run(self.output_names, {self.input_names[0]: img, self.input_names[1]: latent})[0]
return pred
def get(self, img, target_face, source_face, paste_back=True):
aimg, M = face_align.norm_crop2(img, target_face.kps, self.input_size[0])
blob = cv2.dnn.blobFromImage(aimg, 1.0 / self.input_std, self.input_size,
(self.input_mean, self.input_mean, self.input_mean), swapRB=True)
latent = source_face.normed_embedding.reshape((1,-1))
latent = np.dot(latent, self.emap)
latent /= np.linalg.norm(latent)
pred = self.session.run(self.output_names, {self.input_names[0]: blob, self.input_names[1]: latent})[0]
#print(latent.shape, latent.dtype, pred.shape)
img_fake = pred.transpose((0,2,3,1))[0]
bgr_fake = np.clip(255 * img_fake, 0, 255).astype(np.uint8)[:,:,::-1]
if not paste_back:
return bgr_fake, M
else:
target_img = img
fake_diff = bgr_fake.astype(np.float32) - aimg.astype(np.float32)
fake_diff = np.abs(fake_diff).mean(axis=2)
fake_diff[:2,:] = 0
fake_diff[-2:,:] = 0
fake_diff[:,:2] = 0
fake_diff[:,-2:] = 0
IM = cv2.invertAffineTransform(M)
img_white = np.full((aimg.shape[0],aimg.shape[1]), 255, dtype=np.float32)
bgr_fake = cv2.warpAffine(bgr_fake, IM, (target_img.shape[1], target_img.shape[0]), borderValue=0.0)
img_white = cv2.warpAffine(img_white, IM, (target_img.shape[1], target_img.shape[0]), borderValue=0.0)
fake_diff = cv2.warpAffine(fake_diff, IM, (target_img.shape[1], target_img.shape[0]), borderValue=0.0)
img_white[img_white>20] = 255
fthresh = 10
fake_diff[fake_diff<fthresh] = 0
fake_diff[fake_diff>=fthresh] = 255
img_mask = img_white
mask_h_inds, mask_w_inds = np.where(img_mask==255)
mask_h = np.max(mask_h_inds) - np.min(mask_h_inds)
mask_w = np.max(mask_w_inds) - np.min(mask_w_inds)
mask_size = int(np.sqrt(mask_h*mask_w))
k = max(mask_size//10, 10)
#k = max(mask_size//20, 6)
#k = 6
kernel = np.ones((k,k),np.uint8)
img_mask = cv2.erode(img_mask,kernel,iterations = 1)
kernel = np.ones((2,2),np.uint8)
fake_diff = cv2.dilate(fake_diff,kernel,iterations = 1)
k = max(mask_size//20, 5)
#k = 3
#k = 3
kernel_size = (k, k)
blur_size = tuple(2*i+1 for i in kernel_size)
img_mask = cv2.GaussianBlur(img_mask, blur_size, 0)
k = 5
kernel_size = (k, k)
blur_size = tuple(2*i+1 for i in kernel_size)
fake_diff = cv2.GaussianBlur(fake_diff, blur_size, 0)
img_mask /= 255
fake_diff /= 255
#img_mask = fake_diff
img_mask = np.reshape(img_mask, [img_mask.shape[0],img_mask.shape[1],1])
fake_merged = img_mask * bgr_fake + (1-img_mask) * target_img.astype(np.float32)
fake_merged = fake_merged.astype(np.uint8)
return fake_merged
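# Usage sketch (the model path is illustrative):
# swapper = INSwapper('models/inswapper_128.onnx')
# result = swapper.get(img, target_face, source_face, paste_back=True)  # source face blended into img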

View File

@ -0,0 +1,112 @@
# -*- coding: utf-8 -*-
# @Organization : insightface.ai
# @Author : Jia Guo
# @Time : 2021-05-04
# @Function :
from __future__ import division
import numpy as np
import cv2
import onnx
import onnxruntime
from ..utils import face_align
from ..utils import transform
from ..data import get_object
__all__ = [
'Landmark',
]
class Landmark:
def __init__(self, model_file=None, session=None):
assert model_file is not None
self.model_file = model_file
self.session = session
find_sub = False
find_mul = False
model = onnx.load(self.model_file)
graph = model.graph
for nid, node in enumerate(graph.node[:8]):
#print(nid, node.name)
if node.name.startswith('Sub') or node.name.startswith('_minus'):
find_sub = True
if node.name.startswith('Mul') or node.name.startswith('_mul'):
find_mul = True
if nid < 3 and node.name == 'bn_data':
find_sub = True
find_mul = True
if find_sub and find_mul:
#mxnet arcface model
input_mean = 0.0
input_std = 1.0
else:
input_mean = 127.5
input_std = 128.0
self.input_mean = input_mean
self.input_std = input_std
#print('input mean and std:', model_file, self.input_mean, self.input_std)
if self.session is None:
self.session = onnxruntime.InferenceSession(self.model_file, None)
input_cfg = self.session.get_inputs()[0]
input_shape = input_cfg.shape
input_name = input_cfg.name
self.input_size = tuple(input_shape[2:4][::-1])
self.input_shape = input_shape
outputs = self.session.get_outputs()
output_names = []
for out in outputs:
output_names.append(out.name)
self.input_name = input_name
self.output_names = output_names
assert len(self.output_names) == 1
output_shape = outputs[0].shape
self.require_pose = False
#print('init output_shape:', output_shape)
if output_shape[1] == 3309:
self.lmk_dim = 3
self.lmk_num = 68
self.mean_lmk = get_object('meanshape_68.pkl')
self.require_pose = True
else:
self.lmk_dim = 2
self.lmk_num = output_shape[1] // self.lmk_dim
self.taskname = 'landmark_%dd_%d' % (self.lmk_dim, self.lmk_num)
def prepare(self, ctx_id, **kwargs):
if ctx_id < 0:
self.session.set_providers(['CUDAExecutionProvider'])
def get(self, img, face):
bbox = face.bbox
w, h = (bbox[2] - bbox[0]), (bbox[3] - bbox[1])
center = (bbox[2] + bbox[0]) / 2, (bbox[3] + bbox[1]) / 2
rotate = 0
_scale = self.input_size[0] / (max(w, h) * 1.5)
#print('param:', img.shape, bbox, center, self.input_size, _scale, rotate)
aimg, M = face_align.transform(img, center, self.input_size[0], _scale, rotate)
input_size = tuple(aimg.shape[0:2][::-1])
#assert input_size==self.input_size
blob = cv2.dnn.blobFromImage(aimg, 1.0 / self.input_std, input_size, (self.input_mean, self.input_mean, self.input_mean), swapRB=True)
pred = self.session.run(self.output_names, {self.input_name: blob})[0][0]
if pred.shape[0] >= 3000:
pred = pred.reshape((-1, 3))
else:
pred = pred.reshape((-1, 2))
if self.lmk_num < pred.shape[0]:
pred = pred[self.lmk_num * -1:, :]
pred[:, 0:2] += 1
pred[:, 0:2] *= (self.input_size[0] // 2)
if pred.shape[1] == 3:
pred[:, 2] *= (self.input_size[0] // 2)
IM = cv2.invertAffineTransform(M)
pred = face_align.trans_points(pred, IM)
face[self.taskname] = pred
if self.require_pose:
P = transform.estimate_affine_matrix_3d23d(self.mean_lmk, pred)
s, R, t = transform.P2sRt(P)
rx, ry, rz = transform.matrix2angle(R)
pose = np.array([rx, ry, rz], dtype=np.float32)
face['pose'] = pose #pitch, yaw, roll
return pred

View File

@ -0,0 +1,103 @@
"""
This code file mainly comes from https://github.com/dmlc/gluon-cv/blob/master/gluoncv/model_zoo/model_store.py
"""
from __future__ import print_function
__all__ = ['get_model_file']
import os
import zipfile
import glob
from ..utils import download, check_sha1
_model_sha1 = {
name: checksum
for checksum, name in [
('95be21b58e29e9c1237f229dae534bd854009ce0', 'arcface_r100_v1'),
('', 'arcface_mfn_v1'),
('39fd1e087a2a2ed70a154ac01fecaa86c315d01b', 'retinaface_r50_v1'),
('2c9de8116d1f448fd1d4661f90308faae34c990a', 'retinaface_mnet025_v1'),
('0db1d07921d005e6c9a5b38e059452fc5645e5a4', 'retinaface_mnet025_v2'),
('7dd8111652b7aac2490c5dcddeb268e53ac643e6', 'genderage_v1'),
]
}
base_repo_url = 'https://insightface.ai/files/'
_url_format = '{repo_url}models/{file_name}.zip'
def short_hash(name):
if name not in _model_sha1:
raise ValueError(
'Pretrained model for {name} is not available.'.format(name=name))
return _model_sha1[name][:8]
def find_params_file(dir_path):
if not os.path.exists(dir_path):
return None
paths = glob.glob("%s/*.params" % dir_path)
if len(paths) == 0:
return None
paths = sorted(paths)
return paths[-1]
def get_model_file(name, root=os.path.join('~', '.insightface', 'models')):
r"""Return location for the pretrained on local file system.
This function will download from online model zoo when model cannot be found or has mismatch.
The root directory will be created if it doesn't exist.
Parameters
----------
name : str
Name of the model.
root : str, default '~/.insightface/models'
Location for keeping the model parameters.
Returns
-------
file_path
Path to the requested pretrained model file.
"""
file_name = name
root = os.path.expanduser(root)
dir_path = os.path.join(root, name)
file_path = find_params_file(dir_path)
#file_path = os.path.join(root, file_name + '.params')
sha1_hash = _model_sha1[name]
if file_path is not None:
if check_sha1(file_path, sha1_hash):
return file_path
else:
print(
'Mismatch in the content of model file detected. Downloading again.'
)
else:
print('Model file is not found. Downloading.')
if not os.path.exists(root):
os.makedirs(root)
if not os.path.exists(dir_path):
os.makedirs(dir_path)
zip_file_path = os.path.join(root, file_name + '.zip')
repo_url = base_repo_url
if repo_url[-1] != '/':
repo_url = repo_url + '/'
download(_url_format.format(repo_url=repo_url, file_name=file_name),
path=zip_file_path,
overwrite=True)
with zipfile.ZipFile(zip_file_path) as zf:
zf.extractall(dir_path)
os.remove(zip_file_path)
file_path = find_params_file(dir_path)
if check_sha1(file_path, sha1_hash):
return file_path
else:
raise ValueError(
'Downloaded file has different hash. Please try again.')

View File

@ -0,0 +1,102 @@
# -*- coding: utf-8 -*-
# @Organization : insightface.ai
# @Author : Jia Guo
# @Time : 2021-05-04
# @Function :
import os
import os.path as osp
import glob
import onnxruntime
from .arcface_onnx import *
from .retinaface import *
#from .scrfd import *
from .landmark import *
from .attribute import Attribute
from .inswapper import INSwapper
from ..utils import download_onnx
__all__ = ['get_model']
class PickableInferenceSession(onnxruntime.InferenceSession):
# This is a wrapper to make the current InferenceSession class pickable.
def __init__(self, model_path, **kwargs):
super().__init__(model_path, **kwargs)
self.model_path = model_path
def __getstate__(self):
return {'model_path': self.model_path}
def __setstate__(self, values):
model_path = values['model_path']
self.__init__(model_path)
class ModelRouter:
def __init__(self, onnx_file):
self.onnx_file = onnx_file
def get_model(self, **kwargs):
session = PickableInferenceSession(self.onnx_file, **kwargs)
print(f'Applied providers: {session._providers}, with options: {session._provider_options}')
inputs = session.get_inputs()
input_cfg = inputs[0]
input_shape = input_cfg.shape
outputs = session.get_outputs()
if len(outputs) >= 5:
return RetinaFace(model_file=self.onnx_file, session=session)
elif input_shape[2] == 192 and input_shape[3] == 192:
return Landmark(model_file=self.onnx_file, session=session)
elif input_shape[2] == 96 and input_shape[3] == 96:
return Attribute(model_file=self.onnx_file, session=session)
elif len(inputs) == 2 and input_shape[2] == 128 and input_shape[3] == 128:
return INSwapper(model_file=self.onnx_file, session=session)
elif input_shape[2] == input_shape[3] and input_shape[2] >= 112 and input_shape[2] % 16 == 0:
return ArcFaceONNX(model_file=self.onnx_file, session=session)
else:
#raise RuntimeError('error on model routing')
return None
def find_onnx_file(dir_path):
if not os.path.exists(dir_path):
return None
paths = glob.glob("%s/*.onnx" % dir_path)
if len(paths) == 0:
return None
paths = sorted(paths)
return paths[-1]
def get_default_providers():
return ['CUDAExecutionProvider']
def get_default_provider_options():
return None
def get_model(name, **kwargs):
root = kwargs.get('root', '~/.insightface')
root = os.path.expanduser(root)
model_root = osp.join(root, 'models')
allow_download = kwargs.get('download', False)
download_zip = kwargs.get('download_zip', False)
if not name.endswith('.onnx'):
model_dir = os.path.join(model_root, name)
model_file = find_onnx_file(model_dir)
if model_file is None:
return None
else:
model_file = name
if not osp.exists(model_file) and allow_download:
model_file = download_onnx('models', model_file, root=root, download_zip=download_zip)
assert osp.exists(model_file), 'model_file %s should exist' % model_file
assert osp.isfile(model_file), 'model_file %s should be a file' % model_file
router = ModelRouter(model_file)
providers = kwargs.get('providers', get_default_providers())
provider_options = kwargs.get('provider_options', get_default_provider_options())
model = router.get_model(providers=providers, provider_options=provider_options)
return model
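# Usage sketch (the path is illustrative): routes an .onnx file to the matching wrapper class above,
# model = get_model('models/inswapper_128.onnx', providers=get_default_providers())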

View File

@ -0,0 +1,299 @@
# -*- coding: utf-8 -*-
# @Organization : insightface.ai
# @Author : Jia Guo
# @Time : 2021-09-18
# @Function :
from __future__ import division
import datetime
import numpy as np
import onnx
import onnxruntime
import os
import os.path as osp
import cv2
import sys
def softmax(z):
assert len(z.shape) == 2
s = np.max(z, axis=1)
s = s[:, np.newaxis] # necessary step to do broadcasting
e_x = np.exp(z - s)
div = np.sum(e_x, axis=1)
div = div[:, np.newaxis] # ditto
return e_x / div
def distance2bbox(points, distance, max_shape=None):
"""Decode distance prediction to bounding box.
Args:
points (Tensor): Shape (n, 2), [x, y].
distance (Tensor): Distance from the given point to 4
boundaries (left, top, right, bottom).
max_shape (tuple): Shape of the image.
Returns:
Tensor: Decoded bboxes.
"""
x1 = points[:, 0] - distance[:, 0]
y1 = points[:, 1] - distance[:, 1]
x2 = points[:, 0] + distance[:, 2]
y2 = points[:, 1] + distance[:, 3]
if max_shape is not None:
x1 = x1.clamp(min=0, max=max_shape[1])
y1 = y1.clamp(min=0, max=max_shape[0])
x2 = x2.clamp(min=0, max=max_shape[1])
y2 = y2.clamp(min=0, max=max_shape[0])
return np.stack([x1, y1, x2, y2], axis=-1)
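# Editor's note, worked example: for an anchor point at (100, 100) with
# predicted distances (10, 20, 30, 40) to the left/top/right/bottom
# boundaries, the decoded box is (100-10, 100-20, 100+30, 100+40)
# = (90, 80, 130, 140).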
def distance2kps(points, distance, max_shape=None):
"""Decode distance prediction to bounding box.
Args:
points (Tensor): Shape (n, 2), [x, y].
distance (Tensor): Distance from the given point to 4
boundaries (left, top, right, bottom).
max_shape (tuple): Shape of the image.
Returns:
Tensor: Decoded bboxes.
"""
preds = []
for i in range(0, distance.shape[1], 2):
px = points[:, i % 2] + distance[:, i]
py = points[:, i % 2 + 1] + distance[:, i + 1]
if max_shape is not None:
px = px.clamp(min=0, max=max_shape[1])
py = py.clamp(min=0, max=max_shape[0])
preds.append(px)
preds.append(py)
return np.stack(preds, axis=-1)
class RetinaFace:
def __init__(self, model_file=None, session=None):
import onnxruntime
self.model_file = model_file
self.session = session
self.taskname = 'detection'
if self.session is None:
assert self.model_file is not None
assert osp.exists(self.model_file)
self.session = onnxruntime.InferenceSession(self.model_file, None)
self.center_cache = {}
self.nms_thresh = 0.4
self.det_thresh = 0.5
self._init_vars()
def _init_vars(self):
input_cfg = self.session.get_inputs()[0]
input_shape = input_cfg.shape
#print(input_shape)
if isinstance(input_shape[2], str):
self.input_size = None
else:
self.input_size = tuple(input_shape[2:4][::-1])
#print('image_size:', self.image_size)
input_name = input_cfg.name
self.input_shape = input_shape
outputs = self.session.get_outputs()
output_names = []
for o in outputs:
output_names.append(o.name)
self.input_name = input_name
self.output_names = output_names
self.input_mean = 127.5
self.input_std = 128.0
#print(self.output_names)
#assert len(outputs)==10 or len(outputs)==15
self.use_kps = False
self._anchor_ratio = 1.0
self._num_anchors = 1
if len(outputs) == 6:
self.fmc = 3
self._feat_stride_fpn = [8, 16, 32]
self._num_anchors = 2
elif len(outputs) == 9:
self.fmc = 3
self._feat_stride_fpn = [8, 16, 32]
self._num_anchors = 2
self.use_kps = True
elif len(outputs) == 10:
self.fmc = 5
self._feat_stride_fpn = [8, 16, 32, 64, 128]
self._num_anchors = 1
elif len(outputs) == 15:
self.fmc = 5
self._feat_stride_fpn = [8, 16, 32, 64, 128]
self._num_anchors = 1
self.use_kps = True
def prepare(self, ctx_id, **kwargs):
if ctx_id < 0:
self.session.set_providers(['CPUExecutionProvider'])
nms_thresh = kwargs.get('nms_thresh', None)
if nms_thresh is not None:
self.nms_thresh = nms_thresh
det_thresh = kwargs.get('det_thresh', None)
if det_thresh is not None:
self.det_thresh = det_thresh
input_size = kwargs.get('input_size', None)
if input_size is not None:
if self.input_size is not None:
print('warning: det_size is already set in detection model, ignore')
else:
self.input_size = input_size
def forward(self, img, threshold):
scores_list = []
bboxes_list = []
kpss_list = []
input_size = tuple(img.shape[0:2][::-1])
blob = cv2.dnn.blobFromImage(img, 1.0 / self.input_std, input_size, (self.input_mean, self.input_mean, self.input_mean), swapRB=True)
net_outs = self.session.run(self.output_names, {self.input_name: blob})
input_height = blob.shape[2]
input_width = blob.shape[3]
fmc = self.fmc
for idx, stride in enumerate(self._feat_stride_fpn):
scores = net_outs[idx]
bbox_preds = net_outs[idx + fmc]
bbox_preds = bbox_preds * stride
if self.use_kps:
kps_preds = net_outs[idx + fmc * 2] * stride
height = input_height // stride
width = input_width // stride
K = height * width
key = (height, width, stride)
if key in self.center_cache:
anchor_centers = self.center_cache[key]
else:
#solution-1, c style:
#anchor_centers = np.zeros( (height, width, 2), dtype=np.float32 )
#for i in range(height):
# anchor_centers[i, :, 1] = i
#for i in range(width):
# anchor_centers[:, i, 0] = i
#solution-2:
#ax = np.arange(width, dtype=np.float32)
#ay = np.arange(height, dtype=np.float32)
#xv, yv = np.meshgrid(np.arange(width), np.arange(height))
#anchor_centers = np.stack([xv, yv], axis=-1).astype(np.float32)
#solution-3:
anchor_centers = np.stack(np.mgrid[:height, :width][::-1], axis=-1).astype(np.float32)
#print(anchor_centers.shape)
anchor_centers = (anchor_centers * stride).reshape((-1, 2))
if self._num_anchors > 1:
anchor_centers = np.stack([anchor_centers] * self._num_anchors, axis=1).reshape((-1, 2))
if len(self.center_cache) < 100:
self.center_cache[key] = anchor_centers
pos_inds = np.where(scores >= threshold)[0]
bboxes = distance2bbox(anchor_centers, bbox_preds)
pos_scores = scores[pos_inds]
pos_bboxes = bboxes[pos_inds]
scores_list.append(pos_scores)
bboxes_list.append(pos_bboxes)
if self.use_kps:
kpss = distance2kps(anchor_centers, kps_preds)
#kpss = kps_preds
kpss = kpss.reshape((kpss.shape[0], -1, 2))
pos_kpss = kpss[pos_inds]
kpss_list.append(pos_kpss)
return scores_list, bboxes_list, kpss_list
def detect(self, img, input_size=None, max_num=0, metric='default'):
assert input_size is not None or self.input_size is not None
input_size = self.input_size if input_size is None else input_size
im_ratio = float(img.shape[0]) / img.shape[1]
model_ratio = float(input_size[1]) / input_size[0]
if im_ratio > model_ratio:
new_height = input_size[1]
new_width = int(new_height / im_ratio)
else:
new_width = input_size[0]
new_height = int(new_width * im_ratio)
det_scale = float(new_height) / img.shape[0]
resized_img = cv2.resize(img, (new_width, new_height))
det_img = np.zeros((input_size[1], input_size[0], 3), dtype=np.uint8)
det_img[:new_height, :new_width, :] = resized_img
scores_list, bboxes_list, kpss_list = self.forward(det_img, self.det_thresh)
scores = np.vstack(scores_list)
scores_ravel = scores.ravel()
order = scores_ravel.argsort()[::-1]
bboxes = np.vstack(bboxes_list) / det_scale
if self.use_kps:
kpss = np.vstack(kpss_list) / det_scale
pre_det = np.hstack((bboxes, scores)).astype(np.float32, copy=False)
pre_det = pre_det[order, :]
keep = self.nms(pre_det)
det = pre_det[keep, :]
if self.use_kps:
kpss = kpss[order, :, :]
kpss = kpss[keep, :, :]
else:
kpss = None
if max_num > 0 and det.shape[0] > max_num:
area = (det[:, 2] - det[:, 0]) * (det[:, 3] - det[:, 1])
img_center = img.shape[0] // 2, img.shape[1] // 2
offsets = np.vstack([(det[:, 0] + det[:, 2]) / 2 - img_center[1], (det[:, 1] + det[:, 3]) / 2 - img_center[0]])
offset_dist_squared = np.sum(np.power(offsets, 2.0), 0)
if metric == 'max':
values = area
else:
values = area - offset_dist_squared * 2.0 # some extra weight on the centering
bindex = np.argsort(values)[::-1]
bindex = bindex[0:max_num]
det = det[bindex, :]
if kpss is not None:
kpss = kpss[bindex, :]
return det, kpss
def nms(self, dets):
thresh = self.nms_thresh
x1 = dets[:, 0]
y1 = dets[:, 1]
x2 = dets[:, 2]
y2 = dets[:, 3]
scores = dets[:, 4]
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
order = scores.argsort()[::-1]
keep = []
while order.size > 0:
i = order[0]
keep.append(i)
xx1 = np.maximum(x1[i], x1[order[1:]])
yy1 = np.maximum(y1[i], y1[order[1:]])
xx2 = np.minimum(x2[i], x2[order[1:]])
yy2 = np.minimum(y2[i], y2[order[1:]])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
ovr = inter / (areas[i] + areas[order[1:]] - inter)
inds = np.where(ovr <= thresh)[0]
order = order[inds + 1]
return keep
def get_retinaface(name, download=False, root='~/.insightface/models', **kwargs):
if not download:
assert os.path.exists(name)
return RetinaFace(name)
else:
from .model_store import get_model_file
_file = get_model_file("retinaface_%s" % name, root=root)
return RetinaFace(_file)
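A minimal detection sketch using the class above (file paths are hypothetical):
import cv2
detector = RetinaFace(model_file='det_10g.onnx')  # hypothetical local model
detector.prepare(ctx_id=0, input_size=(640, 640))
img = cv2.imread('portrait.jpg')                  # hypothetical input image
dets, kpss = detector.detect(img)
# dets: [n, 5] boxes as x1, y1, x2, y2, score; kpss: [n, 5, 2] landmarks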

View File

@@ -0,0 +1,347 @@
# -*- coding: utf-8 -*-
# @Organization : insightface.ai
# @Author : Jia Guo
# @Time : 2021-05-04
# @Function :
from __future__ import division
import datetime
import numpy as np
import onnx
import onnxruntime
import os
import os.path as osp
import cv2
import sys
def softmax(z):
assert len(z.shape) == 2
s = np.max(z, axis=1)
s = s[:, np.newaxis] # necessary step to do broadcasting
e_x = np.exp(z - s)
div = np.sum(e_x, axis=1)
div = div[:, np.newaxis] # ditto
return e_x / div
def distance2bbox(points, distance, max_shape=None):
"""Decode distance prediction to bounding box.
Args:
points (Tensor): Shape (n, 2), [x, y].
distance (Tensor): Distance from the given point to 4
boundaries (left, top, right, bottom).
max_shape (tuple): Shape of the image.
Returns:
Tensor: Decoded bboxes.
"""
x1 = points[:, 0] - distance[:, 0]
y1 = points[:, 1] - distance[:, 1]
x2 = points[:, 0] + distance[:, 2]
y2 = points[:, 1] + distance[:, 3]
if max_shape is not None:
x1 = x1.clamp(min=0, max=max_shape[1])
y1 = y1.clamp(min=0, max=max_shape[0])
x2 = x2.clamp(min=0, max=max_shape[1])
y2 = y2.clamp(min=0, max=max_shape[0])
return np.stack([x1, y1, x2, y2], axis=-1)
def distance2kps(points, distance, max_shape=None):
"""Decode distance prediction to bounding box.
Args:
points (Tensor): Shape (n, 2), [x, y].
distance (Tensor): Distance from the given point to 4
boundaries (left, top, right, bottom).
max_shape (tuple): Shape of the image.
Returns:
Tensor: Decoded bboxes.
"""
preds = []
for i in range(0, distance.shape[1], 2):
px = points[:, i % 2] + distance[:, i]
py = points[:, i % 2 + 1] + distance[:, i + 1]
if max_shape is not None:
px = px.clamp(min=0, max=max_shape[1])
py = py.clamp(min=0, max=max_shape[0])
preds.append(px)
preds.append(py)
return np.stack(preds, axis=-1)
class SCRFD:
def __init__(self, model_file=None, session=None):
import onnxruntime
self.model_file = model_file
self.session = session
self.taskname = 'detection'
self.batched = False
if self.session is None:
assert self.model_file is not None
assert osp.exists(self.model_file)
self.session = onnxruntime.InferenceSession(self.model_file, None)
self.center_cache = {}
self.nms_thresh = 0.4
self.det_thresh = 0.5
self._init_vars()
def _init_vars(self):
input_cfg = self.session.get_inputs()[0]
input_shape = input_cfg.shape
#print(input_shape)
if isinstance(input_shape[2], str):
self.input_size = None
else:
self.input_size = tuple(input_shape[2:4][::-1])
#print('image_size:', self.image_size)
input_name = input_cfg.name
self.input_shape = input_shape
outputs = self.session.get_outputs()
if len(outputs[0].shape) == 3:
self.batched = True
output_names = []
for o in outputs:
output_names.append(o.name)
self.input_name = input_name
self.output_names = output_names
self.input_mean = 127.5
self.input_std = 128.0
#print(self.output_names)
#assert len(outputs)==10 or len(outputs)==15
self.use_kps = False
self._anchor_ratio = 1.0
self._num_anchors = 1
if len(outputs) == 6:
self.fmc = 3
self._feat_stride_fpn = [8, 16, 32]
self._num_anchors = 2
elif len(outputs) == 9:
self.fmc = 3
self._feat_stride_fpn = [8, 16, 32]
self._num_anchors = 2
self.use_kps = True
elif len(outputs) == 10:
self.fmc = 5
self._feat_stride_fpn = [8, 16, 32, 64, 128]
self._num_anchors = 1
elif len(outputs) == 15:
self.fmc = 5
self._feat_stride_fpn = [8, 16, 32, 64, 128]
self._num_anchors = 1
self.use_kps = True
def prepare(self, ctx_id, **kwargs):
if ctx_id < 0:
self.session.set_providers(['CPUExecutionProvider'])
nms_thresh = kwargs.get('nms_thresh', None)
if nms_thresh is not None:
self.nms_thresh = nms_thresh
det_thresh = kwargs.get('det_thresh', None)
if det_thresh is not None:
self.det_thresh = det_thresh
input_size = kwargs.get('input_size', None)
if input_size is not None:
if self.input_size is not None:
print('warning: det_size is already set in scrfd model, ignore')
else:
self.input_size = input_size
def forward(self, img, threshold):
scores_list = []
bboxes_list = []
kpss_list = []
input_size = tuple(img.shape[0:2][::-1])
blob = cv2.dnn.blobFromImage(img, 1.0 / self.input_std, input_size, (self.input_mean, self.input_mean, self.input_mean), swapRB=True)
net_outs = self.session.run(self.output_names, {self.input_name: blob})
input_height = blob.shape[2]
input_width = blob.shape[3]
fmc = self.fmc
for idx, stride in enumerate(self._feat_stride_fpn):
# If model support batch dim, take first output
if self.batched:
scores = net_outs[idx][0]
bbox_preds = net_outs[idx + fmc][0]
bbox_preds = bbox_preds * stride
if self.use_kps:
kps_preds = net_outs[idx + fmc * 2][0] * stride
# If model doesn't support batching take output as is
else:
scores = net_outs[idx]
bbox_preds = net_outs[idx + fmc]
bbox_preds = bbox_preds * stride
if self.use_kps:
kps_preds = net_outs[idx + fmc * 2] * stride
height = input_height // stride
width = input_width // stride
K = height * width
key = (height, width, stride)
if key in self.center_cache:
anchor_centers = self.center_cache[key]
else:
#solution-1, c style:
#anchor_centers = np.zeros( (height, width, 2), dtype=np.float32 )
#for i in range(height):
# anchor_centers[i, :, 1] = i
#for i in range(width):
# anchor_centers[:, i, 0] = i
#solution-2:
#ax = np.arange(width, dtype=np.float32)
#ay = np.arange(height, dtype=np.float32)
#xv, yv = np.meshgrid(np.arange(width), np.arange(height))
#anchor_centers = np.stack([xv, yv], axis=-1).astype(np.float32)
#solution-3:
anchor_centers = np.stack(np.mgrid[:height, :width][::-1], axis=-1).astype(np.float32)
#print(anchor_centers.shape)
anchor_centers = (anchor_centers * stride).reshape((-1, 2))
if self._num_anchors > 1:
anchor_centers = np.stack([anchor_centers] * self._num_anchors, axis=1).reshape((-1, 2))
if len(self.center_cache) < 100:
self.center_cache[key] = anchor_centers
pos_inds = np.where(scores >= threshold)[0]
bboxes = distance2bbox(anchor_centers, bbox_preds)
pos_scores = scores[pos_inds]
pos_bboxes = bboxes[pos_inds]
scores_list.append(pos_scores)
bboxes_list.append(pos_bboxes)
if self.use_kps:
kpss = distance2kps(anchor_centers, kps_preds)
#kpss = kps_preds
kpss = kpss.reshape((kpss.shape[0], -1, 2))
pos_kpss = kpss[pos_inds]
kpss_list.append(pos_kpss)
return scores_list, bboxes_list, kpss_list
def detect(self, img, input_size=None, max_num=0, metric='default'):
assert input_size is not None or self.input_size is not None
input_size = self.input_size if input_size is None else input_size
im_ratio = float(img.shape[0]) / img.shape[1]
model_ratio = float(input_size[1]) / input_size[0]
if im_ratio > model_ratio:
new_height = input_size[1]
new_width = int(new_height / im_ratio)
else:
new_width = input_size[0]
new_height = int(new_width * im_ratio)
det_scale = float(new_height) / img.shape[0]
resized_img = cv2.resize(img, (new_width, new_height))
det_img = np.zeros((input_size[1], input_size[0], 3), dtype=np.uint8)
det_img[:new_height, :new_width, :] = resized_img
scores_list, bboxes_list, kpss_list = self.forward(det_img, self.det_thresh)
scores = np.vstack(scores_list)
scores_ravel = scores.ravel()
order = scores_ravel.argsort()[::-1]
bboxes = np.vstack(bboxes_list) / det_scale
if self.use_kps:
kpss = np.vstack(kpss_list) / det_scale
pre_det = np.hstack((bboxes, scores)).astype(np.float32, copy=False)
pre_det = pre_det[order, :]
keep = self.nms(pre_det)
det = pre_det[keep, :]
if self.use_kps:
kpss = kpss[order, :, :]
kpss = kpss[keep, :, :]
else:
kpss = None
if max_num > 0 and det.shape[0] > max_num:
area = (det[:, 2] - det[:, 0]) * (det[:, 3] - det[:, 1])
img_center = img.shape[0] // 2, img.shape[1] // 2
offsets = np.vstack([(det[:, 0] + det[:, 2]) / 2 - img_center[1], (det[:, 1] + det[:, 3]) / 2 - img_center[0]])
offset_dist_squared = np.sum(np.power(offsets, 2.0), 0)
if metric == 'max':
values = area
else:
values = area - offset_dist_squared * 2.0 # some extra weight on the centering
bindex = np.argsort(values)[::-1]
bindex = bindex[0:max_num]
det = det[bindex, :]
if kpss is not None:
kpss = kpss[bindex, :]
return det, kpss
def nms(self, dets):
thresh = self.nms_thresh
x1 = dets[:, 0]
y1 = dets[:, 1]
x2 = dets[:, 2]
y2 = dets[:, 3]
scores = dets[:, 4]
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
order = scores.argsort()[::-1]
keep = []
while order.size > 0:
i = order[0]
keep.append(i)
xx1 = np.maximum(x1[i], x1[order[1:]])
yy1 = np.maximum(y1[i], y1[order[1:]])
xx2 = np.minimum(x2[i], x2[order[1:]])
yy2 = np.minimum(y2[i], y2[order[1:]])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
ovr = inter / (areas[i] + areas[order[1:]] - inter)
inds = np.where(ovr <= thresh)[0]
order = order[inds + 1]
return keep
def get_scrfd(name, download=False, root='~/.insightface/models', **kwargs):
if not download:
assert os.path.exists(name)
return SCRFD(name)
else:
from .model_store import get_model_file
_file = get_model_file("scrfd_%s" % name, root=root)
return SCRFD(_file)
def scrfd_2p5gkps(**kwargs):
return get_scrfd("2p5gkps", download=True, **kwargs)
if __name__ == '__main__':
import glob
detector = SCRFD(model_file='./det.onnx')
detector.prepare(-1)
img_paths = ['tests/data/t1.jpg']
for img_path in img_paths:
img = cv2.imread(img_path)
for _ in range(1):
ta = datetime.datetime.now()
#bboxes, kpss = detector.detect(img, 0.5, input_size = (640, 640))
bboxes, kpss = detector.detect(img, 0.5)
tb = datetime.datetime.now()
print('all cost:', (tb - ta).total_seconds() * 1000)
print(img_path, bboxes.shape)
if kpss is not None:
print(kpss.shape)
for i in range(bboxes.shape[0]):
bbox = bboxes[i]
x1, y1, x2, y2, score = bbox.astype(np.int32)
cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 0), 2)
if kpss is not None:
kps = kpss[i]
for kp in kps:
kp = kp.astype(np.int32)
cv2.circle(img, tuple(kp), 1, (0, 0, 255), 2)
filename = img_path.split('/')[-1]
print('output:', filename)
cv2.imwrite('./outputs/%s' % filename, img)

View File

View File

@@ -0,0 +1,4 @@
#import mesh
#import morphable_model
from . import mesh
from . import morphable_model

View File

@@ -0,0 +1,15 @@
#from __future__ import absolute_import
#from cython import mesh_core_cython
#import io
#import vis
#import transform
#import light
#import render
# from .cython import mesh_core_cython
# from . import io
# from . import vis
# from . import transform
# from . import light
# from . import render

View File

@@ -0,0 +1,375 @@
/*
functions that can not be optimized by vectorization in python.
1. rasterization.(need process each triangle)
2. normal of each vertex.(use one-ring, need process each vertex)
3. write obj(seems that it can be vectorized? anyway, writing it in c++ is simple, so also add function here. --> however, why is writing it in c++ still slow?)
Author: Yao Feng
Mail: yaofeng1995@gmail.com
*/
#include "mesh_core.h"
/* Judge whether the point is in the triangle
Method:
http://blackpawn.com/texts/pointinpoly/
Args:
point: [x, y]
tri_points: three vertices(2d points) of a triangle. 2 coords x 3 vertices
Returns:
bool: true for in triangle
*/
bool isPointInTri(point p, point p0, point p1, point p2)
{
// vectors
point v0, v1, v2;
v0 = p2 - p0;
v1 = p1 - p0;
v2 = p - p0;
// dot products
float dot00 = v0.dot(v0); //v0.x * v0.x + v0.y * v0.y //np.dot(v0.T, v0)
float dot01 = v0.dot(v1); //v0.x * v1.x + v0.y * v1.y //np.dot(v0.T, v1)
float dot02 = v0.dot(v2); //v0.x * v2.x + v0.y * v2.y //np.dot(v0.T, v2)
float dot11 = v1.dot(v1); //v1.x * v1.x + v1.y * v1.y //np.dot(v1.T, v1)
float dot12 = v1.dot(v2); //v1.x * v2.x + v1.y * v2.y//np.dot(v1.T, v2)
// barycentric coordinates
float inverDeno;
if(dot00*dot11 - dot01*dot01 == 0)
inverDeno = 0;
else
inverDeno = 1/(dot00*dot11 - dot01*dot01);
float u = (dot11*dot02 - dot01*dot12)*inverDeno;
float v = (dot00*dot12 - dot01*dot02)*inverDeno;
// check if point in triangle
return (u >= 0) && (v >= 0) && (u + v < 1);
}
void get_point_weight(float* weight, point p, point p0, point p1, point p2)
{
// vectors
point v0, v1, v2;
v0 = p2 - p0;
v1 = p1 - p0;
v2 = p - p0;
// dot products
float dot00 = v0.dot(v0); //v0.x * v0.x + v0.y * v0.y //np.dot(v0.T, v0)
float dot01 = v0.dot(v1); //v0.x * v1.x + v0.y * v1.y //np.dot(v0.T, v1)
float dot02 = v0.dot(v2); //v0.x * v2.x + v0.y * v2.y //np.dot(v0.T, v2)
float dot11 = v1.dot(v1); //v1.x * v1.x + v1.y * v1.y //np.dot(v1.T, v1)
float dot12 = v1.dot(v2); //v1.x * v2.x + v1.y * v2.y//np.dot(v1.T, v2)
// barycentric coordinates
float inverDeno;
if(dot00*dot11 - dot01*dot01 == 0)
inverDeno = 0;
else
inverDeno = 1/(dot00*dot11 - dot01*dot01);
float u = (dot11*dot02 - dot01*dot12)*inverDeno;
float v = (dot00*dot12 - dot01*dot02)*inverDeno;
// weight
weight[0] = 1 - u - v;
weight[1] = v;
weight[2] = u;
}
void _get_normal_core(
float* normal, float* tri_normal, int* triangles,
int ntri)
{
int i, j;
int tri_p0_ind, tri_p1_ind, tri_p2_ind;
for(i = 0; i < ntri; i++)
{
tri_p0_ind = triangles[3*i];
tri_p1_ind = triangles[3*i + 1];
tri_p2_ind = triangles[3*i + 2];
for(j = 0; j < 3; j++)
{
normal[3*tri_p0_ind + j] = normal[3*tri_p0_ind + j] + tri_normal[3*i + j];
normal[3*tri_p1_ind + j] = normal[3*tri_p1_ind + j] + tri_normal[3*i + j];
normal[3*tri_p2_ind + j] = normal[3*tri_p2_ind + j] + tri_normal[3*i + j];
}
}
}
void _rasterize_triangles_core(
float* vertices, int* triangles,
float* depth_buffer, int* triangle_buffer, float* barycentric_weight,
int nver, int ntri,
int h, int w)
{
int i;
int x, y, k;
int tri_p0_ind, tri_p1_ind, tri_p2_ind;
point p0, p1, p2, p;
int x_min, x_max, y_min, y_max;
float p_depth, p0_depth, p1_depth, p2_depth;
float weight[3];
for(i = 0; i < ntri; i++)
{
tri_p0_ind = triangles[3*i];
tri_p1_ind = triangles[3*i + 1];
tri_p2_ind = triangles[3*i + 2];
p0.x = vertices[3*tri_p0_ind]; p0.y = vertices[3*tri_p0_ind + 1]; p0_depth = vertices[3*tri_p0_ind + 2];
p1.x = vertices[3*tri_p1_ind]; p1.y = vertices[3*tri_p1_ind + 1]; p1_depth = vertices[3*tri_p1_ind + 2];
p2.x = vertices[3*tri_p2_ind]; p2.y = vertices[3*tri_p2_ind + 1]; p2_depth = vertices[3*tri_p2_ind + 2];
x_min = max((int)ceil(min(p0.x, min(p1.x, p2.x))), 0);
x_max = min((int)floor(max(p0.x, max(p1.x, p2.x))), w - 1);
y_min = max((int)ceil(min(p0.y, min(p1.y, p2.y))), 0);
y_max = min((int)floor(max(p0.y, max(p1.y, p2.y))), h - 1);
if(x_max < x_min || y_max < y_min)
{
continue;
}
for(y = y_min; y <= y_max; y++) //h
{
for(x = x_min; x <= x_max; x++) //w
{
p.x = x; p.y = y;
if(p.x < 2 || p.x > w - 3 || p.y < 2 || p.y > h - 3 || isPointInTri(p, p0, p1, p2))
{
get_point_weight(weight, p, p0, p1, p2);
p_depth = weight[0]*p0_depth + weight[1]*p1_depth + weight[2]*p2_depth;
if((p_depth > depth_buffer[y*w + x]))
{
depth_buffer[y*w + x] = p_depth;
triangle_buffer[y*w + x] = i;
for(k = 0; k < 3; k++)
{
barycentric_weight[y*w*3 + x*3 + k] = weight[k];
}
}
}
}
}
}
}
void _render_colors_core(
float* image, float* vertices, int* triangles,
float* colors,
float* depth_buffer,
int nver, int ntri,
int h, int w, int c)
{
int i;
int x, y, k;
int tri_p0_ind, tri_p1_ind, tri_p2_ind;
point p0, p1, p2, p;
int x_min, x_max, y_min, y_max;
float p_depth, p0_depth, p1_depth, p2_depth;
float p_color, p0_color, p1_color, p2_color;
float weight[3];
for(i = 0; i < ntri; i++)
{
tri_p0_ind = triangles[3*i];
tri_p1_ind = triangles[3*i + 1];
tri_p2_ind = triangles[3*i + 2];
p0.x = vertices[3*tri_p0_ind]; p0.y = vertices[3*tri_p0_ind + 1]; p0_depth = vertices[3*tri_p0_ind + 2];
p1.x = vertices[3*tri_p1_ind]; p1.y = vertices[3*tri_p1_ind + 1]; p1_depth = vertices[3*tri_p1_ind + 2];
p2.x = vertices[3*tri_p2_ind]; p2.y = vertices[3*tri_p2_ind + 1]; p2_depth = vertices[3*tri_p2_ind + 2];
x_min = max((int)ceil(min(p0.x, min(p1.x, p2.x))), 0);
x_max = min((int)floor(max(p0.x, max(p1.x, p2.x))), w - 1);
y_min = max((int)ceil(min(p0.y, min(p1.y, p2.y))), 0);
y_max = min((int)floor(max(p0.y, max(p1.y, p2.y))), h - 1);
if(x_max < x_min || y_max < y_min)
{
continue;
}
for(y = y_min; y <= y_max; y++) //h
{
for(x = x_min; x <= x_max; x++) //w
{
p.x = x; p.y = y;
if(p.x < 2 || p.x > w - 3 || p.y < 2 || p.y > h - 3 || isPointInTri(p, p0, p1, p2))
{
get_point_weight(weight, p, p0, p1, p2);
p_depth = weight[0]*p0_depth + weight[1]*p1_depth + weight[2]*p2_depth;
if((p_depth > depth_buffer[y*w + x]))
{
for(k = 0; k < c; k++) // c
{
p0_color = colors[c*tri_p0_ind + k];
p1_color = colors[c*tri_p1_ind + k];
p2_color = colors[c*tri_p2_ind + k];
p_color = weight[0]*p0_color + weight[1]*p1_color + weight[2]*p2_color;
image[y*w*c + x*c + k] = p_color;
}
depth_buffer[y*w + x] = p_depth;
}
}
}
}
}
}
void _render_texture_core(
float* image, float* vertices, int* triangles,
float* texture, float* tex_coords, int* tex_triangles,
float* depth_buffer,
int nver, int tex_nver, int ntri,
int h, int w, int c,
int tex_h, int tex_w, int tex_c,
int mapping_type)
{
int i;
int x, y, k;
int tri_p0_ind, tri_p1_ind, tri_p2_ind;
int tex_tri_p0_ind, tex_tri_p1_ind, tex_tri_p2_ind;
point p0, p1, p2, p;
point tex_p0, tex_p1, tex_p2, tex_p;
int x_min, x_max, y_min, y_max;
float weight[3];
float p_depth, p0_depth, p1_depth, p2_depth;
float xd, yd;
float ul, ur, dl, dr;
for(i = 0; i < ntri; i++)
{
// mesh
tri_p0_ind = triangles[3*i];
tri_p1_ind = triangles[3*i + 1];
tri_p2_ind = triangles[3*i + 2];
p0.x = vertices[3*tri_p0_ind]; p0.y = vertices[3*tri_p0_ind + 1]; p0_depth = vertices[3*tri_p0_ind + 2];
p1.x = vertices[3*tri_p1_ind]; p1.y = vertices[3*tri_p1_ind + 1]; p1_depth = vertices[3*tri_p1_ind + 2];
p2.x = vertices[3*tri_p2_ind]; p2.y = vertices[3*tri_p2_ind + 1]; p2_depth = vertices[3*tri_p2_ind + 2];
// texture
tex_tri_p0_ind = tex_triangles[3*i];
tex_tri_p1_ind = tex_triangles[3*i + 1];
tex_tri_p2_ind = tex_triangles[3*i + 2];
tex_p0.x = tex_coords[3*tex_tri_p0_ind]; tex_p0.y = tex_coords[3*tex_tri_p0_ind + 1];
tex_p1.x = tex_coords[3*tex_tri_p1_ind]; tex_p1.y = tex_coords[3*tex_tri_p1_ind + 1];
tex_p2.x = tex_coords[3*tex_tri_p2_ind]; tex_p2.y = tex_coords[3*tex_tri_p2_ind + 1];
x_min = max((int)ceil(min(p0.x, min(p1.x, p2.x))), 0);
x_max = min((int)floor(max(p0.x, max(p1.x, p2.x))), w - 1);
y_min = max((int)ceil(min(p0.y, min(p1.y, p2.y))), 0);
y_max = min((int)floor(max(p0.y, max(p1.y, p2.y))), h - 1);
if(x_max < x_min || y_max < y_min)
{
continue;
}
for(y = y_min; y <= y_max; y++) //h
{
for(x = x_min; x <= x_max; x++) //w
{
p.x = x; p.y = y;
if(p.x < 2 || p.x > w - 3 || p.y < 2 || p.y > h - 3 || isPointInTri(p, p0, p1, p2))
{
get_point_weight(weight, p, p0, p1, p2);
p_depth = weight[0]*p0_depth + weight[1]*p1_depth + weight[2]*p2_depth;
if((p_depth > depth_buffer[y*w + x]))
{
// -- color from texture
// cal weight in mesh tri
get_point_weight(weight, p, p0, p1, p2);
// cal coord in texture
tex_p = tex_p0*weight[0] + tex_p1*weight[1] + tex_p2*weight[2];
tex_p.x = max(min(tex_p.x, float(tex_w - 1)), float(0));
tex_p.y = max(min(tex_p.y, float(tex_h - 1)), float(0));
yd = tex_p.y - floor(tex_p.y);
xd = tex_p.x - floor(tex_p.x);
for(k = 0; k < c; k++)
{
if(mapping_type==0)// nearest
{
image[y*w*c + x*c + k] = texture[int(round(tex_p.y))*tex_w*tex_c + int(round(tex_p.x))*tex_c + k];
}
else//bilinear interp
{
ul = texture[(int)floor(tex_p.y)*tex_w*tex_c + (int)floor(tex_p.x)*tex_c + k];
ur = texture[(int)floor(tex_p.y)*tex_w*tex_c + (int)ceil(tex_p.x)*tex_c + k];
dl = texture[(int)ceil(tex_p.y)*tex_w*tex_c + (int)floor(tex_p.x)*tex_c + k];
dr = texture[(int)ceil(tex_p.y)*tex_w*tex_c + (int)ceil(tex_p.x)*tex_c + k];
image[y*w*c + x*c + k] = ul*(1-xd)*(1-yd) + ur*xd*(1-yd) + dl*(1-xd)*yd + dr*xd*yd;
}
}
depth_buffer[y*w + x] = p_depth;
}
}
}
}
}
}
// ------------------------------------------------- write
// obj write
// Ref: https://github.com/patrikhuber/eos/blob/master/include/eos/core/Mesh.hpp
void _write_obj_with_colors_texture(string filename, string mtl_name,
float* vertices, int* triangles, float* colors, float* uv_coords,
int nver, int ntri, int ntexver)
{
int i;
ofstream obj_file(filename.c_str());
// first line of the obj file: the mtl name
obj_file << "mtllib " << mtl_name << endl;
// write vertices
for (i = 0; i < nver; ++i)
{
obj_file << "v " << vertices[3*i] << " " << vertices[3*i + 1] << " " << vertices[3*i + 2] << colors[3*i] << " " << colors[3*i + 1] << " " << colors[3*i + 2] << endl;
}
// write uv coordinates
for (i = 0; i < ntexver; ++i)
{
//obj_file << "vt " << uv_coords[2*i] << " " << (1 - uv_coords[2*i + 1]) << endl;
obj_file << "vt " << uv_coords[2*i] << " " << uv_coords[2*i + 1] << endl;
}
obj_file << "usemtl FaceTexture" << endl;
// write triangles
for (i = 0; i < ntri; ++i)
{
// obj_file << "f " << triangles[3*i] << "/" << triangles[3*i] << " " << triangles[3*i + 1] << "/" << triangles[3*i + 1] << " " << triangles[3*i + 2] << "/" << triangles[3*i + 2] << endl;
obj_file << "f " << triangles[3*i + 2] << "/" << triangles[3*i + 2] << " " << triangles[3*i + 1] << "/" << triangles[3*i + 1] << " " << triangles[3*i] << "/" << triangles[3*i] << endl;
}
}
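For reference, get_point_weight above translates directly to NumPy; a minimal sketch for verification (illustration only, not part of the build):
import numpy as np

def barycentric_weight(p, p0, p1, p2):
    # mirrors get_point_weight: returns weights for p0, p1, p2
    v0, v1, v2 = p2 - p0, p1 - p0, p - p0
    dot00, dot01, dot02 = v0 @ v0, v0 @ v1, v0 @ v2
    dot11, dot12 = v1 @ v1, v1 @ v2
    denom = dot00 * dot11 - dot01 * dot01
    inv = 0.0 if denom == 0 else 1.0 / denom
    u = (dot11 * dot02 - dot01 * dot12) * inv
    v = (dot00 * dot12 - dot01 * dot02) * inv
    return np.array([1 - u - v, v, u])

print(barycentric_weight(np.array([1., 1.]), np.array([0., 0.]),
                         np.array([4., 0.]), np.array([0., 4.])))
# -> [0.5  0.25 0.25]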

View File

@@ -0,0 +1,83 @@
#ifndef MESH_CORE_HPP_
#define MESH_CORE_HPP_
#include <stdio.h>
#include <cmath>
#include <algorithm>
#include <string>
#include <iostream>
#include <fstream>
using namespace std;
class point
{
public:
float x;
float y;
float dot(point p)
{
return this->x * p.x + this->y * p.y;
}
point operator-(const point& p)
{
point np;
np.x = this->x - p.x;
np.y = this->y - p.y;
return np;
}
point operator+(const point& p)
{
point np;
np.x = this->x + p.x;
np.y = this->y + p.y;
return np;
}
point operator*(float s)
{
point np;
np.x = s * this->x;
np.y = s * this->y;
return np;
}
};
bool isPointInTri(point p, point p0, point p1, point p2);
void get_point_weight(float* weight, point p, point p0, point p1, point p2);
void _get_normal_core(
float* normal, float* tri_normal, int* triangles,
int ntri);
void _rasterize_triangles_core(
float* vertices, int* triangles,
float* depth_buffer, int* triangle_buffer, float* barycentric_weight,
int nver, int ntri,
int h, int w);
void _render_colors_core(
float* image, float* vertices, int* triangles,
float* colors,
float* depth_buffer,
int nver, int ntri,
int h, int w, int c);
void _render_texture_core(
float* image, float* vertices, int* triangles,
float* texture, float* tex_coords, int* tex_triangles,
float* depth_buffer,
int nver, int tex_nver, int ntri,
int h, int w, int c,
int tex_h, int tex_w, int tex_c,
int mapping_type);
void _write_obj_with_colors_texture(string filename, string mtl_name,
float* vertices, int* triangles, float* colors, float* uv_coords,
int nver, int ntri, int ntexver);
#endif

File diff suppressed because it is too large

File diff suppressed because it is too large

View File

@@ -0,0 +1,109 @@
import numpy as np
cimport numpy as np
from libcpp.string cimport string
# use the Numpy-C-API from Cython
np.import_array()
# cdefine the signature of our c function
cdef extern from "mesh_core.h":
void _rasterize_triangles_core(
float* vertices, int* triangles,
float* depth_buffer, int* triangle_buffer, float* barycentric_weight,
int nver, int ntri,
int h, int w)
void _render_colors_core(
float* image, float* vertices, int* triangles,
float* colors,
float* depth_buffer,
int nver, int ntri,
int h, int w, int c)
void _render_texture_core(
float* image, float* vertices, int* triangles,
float* texture, float* tex_coords, int* tex_triangles,
float* depth_buffer,
int nver, int tex_nver, int ntri,
int h, int w, int c,
int tex_h, int tex_w, int tex_c,
int mapping_type)
void _get_normal_core(
float* normal, float* tri_normal, int* triangles,
int ntri)
void _write_obj_with_colors_texture(string filename, string mtl_name,
float* vertices, int* triangles, float* colors, float* uv_coords,
int nver, int ntri, int ntexver)
def get_normal_core(np.ndarray[float, ndim=2, mode = "c"] normal not None,
np.ndarray[float, ndim=2, mode = "c"] tri_normal not None,
np.ndarray[int, ndim=2, mode="c"] triangles not None,
int ntri
):
_get_normal_core(
<float*> np.PyArray_DATA(normal), <float*> np.PyArray_DATA(tri_normal), <int*> np.PyArray_DATA(triangles),
ntri)
def rasterize_triangles_core(
np.ndarray[float, ndim=2, mode = "c"] vertices not None,
np.ndarray[int, ndim=2, mode="c"] triangles not None,
np.ndarray[float, ndim=2, mode = "c"] depth_buffer not None,
np.ndarray[int, ndim=2, mode = "c"] triangle_buffer not None,
np.ndarray[float, ndim=2, mode = "c"] barycentric_weight not None,
int nver, int ntri,
int h, int w
):
_rasterize_triangles_core(
<float*> np.PyArray_DATA(vertices), <int*> np.PyArray_DATA(triangles),
<float*> np.PyArray_DATA(depth_buffer), <int*> np.PyArray_DATA(triangle_buffer), <float*> np.PyArray_DATA(barycentric_weight),
nver, ntri,
h, w)
def render_colors_core(np.ndarray[float, ndim=3, mode = "c"] image not None,
np.ndarray[float, ndim=2, mode = "c"] vertices not None,
np.ndarray[int, ndim=2, mode="c"] triangles not None,
np.ndarray[float, ndim=2, mode = "c"] colors not None,
np.ndarray[float, ndim=2, mode = "c"] depth_buffer not None,
int nver, int ntri,
int h, int w, int c
):
_render_colors_core(
<float*> np.PyArray_DATA(image), <float*> np.PyArray_DATA(vertices), <int*> np.PyArray_DATA(triangles),
<float*> np.PyArray_DATA(colors),
<float*> np.PyArray_DATA(depth_buffer),
nver, ntri,
h, w, c)
def render_texture_core(np.ndarray[float, ndim=3, mode = "c"] image not None,
np.ndarray[float, ndim=2, mode = "c"] vertices not None,
np.ndarray[int, ndim=2, mode="c"] triangles not None,
np.ndarray[float, ndim=3, mode = "c"] texture not None,
np.ndarray[float, ndim=2, mode = "c"] tex_coords not None,
np.ndarray[int, ndim=2, mode="c"] tex_triangles not None,
np.ndarray[float, ndim=2, mode = "c"] depth_buffer not None,
int nver, int tex_nver, int ntri,
int h, int w, int c,
int tex_h, int tex_w, int tex_c,
int mapping_type
):
_render_texture_core(
<float*> np.PyArray_DATA(image), <float*> np.PyArray_DATA(vertices), <int*> np.PyArray_DATA(triangles),
<float*> np.PyArray_DATA(texture), <float*> np.PyArray_DATA(tex_coords), <int*> np.PyArray_DATA(tex_triangles),
<float*> np.PyArray_DATA(depth_buffer),
nver, tex_nver, ntri,
h, w, c,
tex_h, tex_w, tex_c,
mapping_type)
def write_obj_with_colors_texture_core(string filename, string mtl_name,
np.ndarray[float, ndim=2, mode = "c"] vertices not None,
np.ndarray[int, ndim=2, mode="c"] triangles not None,
np.ndarray[float, ndim=2, mode = "c"] colors not None,
np.ndarray[float, ndim=2, mode = "c"] uv_coords not None,
int nver, int ntri, int ntexver
):
_write_obj_with_colors_texture(filename, mtl_name,
<float*> np.PyArray_DATA(vertices), <int*> np.PyArray_DATA(triangles), <float*> np.PyArray_DATA(colors), <float*> np.PyArray_DATA(uv_coords),
nver, ntri, ntexver)

View File

@@ -0,0 +1,20 @@
'''
python setup.py build_ext -i
to compile
'''
# setup.py
from distutils.core import setup, Extension
from Cython.Build import cythonize
from Cython.Distutils import build_ext
import numpy
setup(
name = 'mesh_core_cython',
cmdclass={'build_ext': build_ext},
ext_modules=[Extension("mesh_core_cython",
sources=["mesh_core_cython.pyx", "mesh_core.cpp"],
language='c++',
include_dirs=[numpy.get_include()])],
)
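After compiling in place with "python setup.py build_ext -i", the extension imports as a regular module; a quick smoke test (editor's sketch):
import numpy as np
import mesh_core_cython  # built in place by the command above

normal = np.zeros((3, 3), dtype=np.float32)
tri_normal = np.ones((1, 3), dtype=np.float32)
triangles = np.array([[0, 1, 2]], dtype=np.int32)
mesh_core_cython.get_normal_core(normal, tri_normal, triangles, 1)
print(normal)  # each of the three vertices accumulates the face normal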

View File

@@ -0,0 +1,142 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import os
from skimage import io
from time import time
from .cython import mesh_core_cython
## TODO
## TODO: c++ version
def read_obj(obj_name):
''' read mesh
'''
return 0
# ------------------------- write
def write_asc(path, vertices):
'''
Args:
vertices: shape = (nver, 3)
'''
if path.split('.')[-1] == 'asc':
np.savetxt(path, vertices)
else:
np.savetxt(path + '.asc', vertices)
def write_obj_with_colors(obj_name, vertices, triangles, colors):
''' Save 3D face model with texture represented by colors.
Args:
obj_name: str
vertices: shape = (nver, 3)
triangles: shape = (ntri, 3)
colors: shape = (nver, 3)
'''
triangles = triangles.copy()
triangles += 1 # meshlab start with 1
if obj_name.split('.')[-1] != 'obj':
obj_name = obj_name + '.obj'
# write obj
with open(obj_name, 'w') as f:
# write vertices & colors
for i in range(vertices.shape[0]):
# s = 'v {} {} {} \n'.format(vertices[0,i], vertices[1,i], vertices[2,i])
s = 'v {} {} {} {} {} {}\n'.format(vertices[i, 0], vertices[i, 1], vertices[i, 2], colors[i, 0], colors[i, 1], colors[i, 2])
f.write(s)
# write f: ver ind/ uv ind
[k, ntri] = triangles.shape
for i in range(triangles.shape[0]):
# s = 'f {} {} {}\n'.format(triangles[i, 0], triangles[i, 1], triangles[i, 2])
s = 'f {} {} {}\n'.format(triangles[i, 2], triangles[i, 1], triangles[i, 0])
f.write(s)
## TODO: c++ version
def write_obj_with_texture(obj_name, vertices, triangles, texture, uv_coords):
''' Save 3D face model with texture represented by texture map.
Ref: https://github.com/patrikhuber/eos/blob/bd00155ebae4b1a13b08bf5a991694d682abbada/include/eos/core/Mesh.hpp
Args:
obj_name: str
vertices: shape = (nver, 3)
triangles: shape = (ntri, 3)
texture: shape = (256,256,3)
uv_coords: shape = (nver, 2) max value<=1
'''
if obj_name.split('.')[-1] != 'obj':
obj_name = obj_name + '.obj'
mtl_name = obj_name.replace('.obj', '.mtl')
texture_name = obj_name.replace('.obj', '_texture.png')
triangles = triangles.copy()
triangles += 1 # mesh lab start with 1
# write obj
with open(obj_name, 'w') as f:
# first line: write mtlib(material library)
s = "mtllib {}\n".format(os.path.abspath(mtl_name))
f.write(s)
# write vertices
for i in range(vertices.shape[0]):
s = 'v {} {} {}\n'.format(vertices[i, 0], vertices[i, 1], vertices[i, 2])
f.write(s)
# write uv coords
for i in range(uv_coords.shape[0]):
s = 'vt {} {}\n'.format(uv_coords[i,0], 1 - uv_coords[i,1])
f.write(s)
f.write("usemtl FaceTexture\n")
# write f: ver ind/ uv ind
for i in range(triangles.shape[0]):
s = 'f {}/{} {}/{} {}/{}\n'.format(triangles[i,2], triangles[i,2], triangles[i,1], triangles[i,1], triangles[i,0], triangles[i,0])
f.write(s)
# write mtl
with open(mtl_name, 'w') as f:
f.write("newmtl FaceTexture\n")
s = 'map_Kd {}\n'.format(os.path.abspath(texture_name)) # map to image
f.write(s)
# write texture as png
io.imsave(texture_name, texture)
# c++ version
def write_obj_with_colors_texture(obj_name, vertices, triangles, colors, texture, uv_coords):
''' Save 3D face model with texture.
Ref: https://github.com/patrikhuber/eos/blob/bd00155ebae4b1a13b08bf5a991694d682abbada/include/eos/core/Mesh.hpp
Args:
obj_name: str
vertices: shape = (nver, 3)
triangles: shape = (ntri, 3)
colors: shape = (nver, 3)
texture: shape = (256,256,3)
uv_coords: shape = (nver, 2) max value<=1
'''
if obj_name.split('.')[-1] != 'obj':
obj_name = obj_name + '.obj'
mtl_name = obj_name.replace('.obj', '.mtl')
texture_name = obj_name.replace('.obj', '_texture.png')
triangles = triangles.copy()
triangles += 1 # mesh lab start with 1
# write obj
vertices, colors, uv_coords = vertices.astype(np.float32).copy(), colors.astype(np.float32).copy(), uv_coords.astype(np.float32).copy()
mesh_core_cython.write_obj_with_colors_texture_core(str.encode(obj_name), str.encode(os.path.abspath(mtl_name)), vertices, triangles, colors, uv_coords, vertices.shape[0], triangles.shape[0], uv_coords.shape[0])
# write mtl
with open(mtl_name, 'w') as f:
f.write("newmtl FaceTexture\n")
s = 'map_Kd {}\n'.format(os.path.abspath(texture_name)) # map to image
f.write(s)
# write texture as png
io.imsave(texture_name, texture)
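A minimal save sketch with the helpers above (hypothetical output path):
import numpy as np
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
tris = np.array([[0, 1, 2]])
cols = np.full((3, 3), 0.5)                             # mid-gray per vertex
write_obj_with_colors('demo_face', verts, tris, cols)   # writes demo_face.obj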

View File

@@ -0,0 +1,213 @@
'''
Functions about lighting mesh(changing colors/texture of mesh).
1. add light to colors/texture (shade each vertex)
2. fit light according to colors/texture & image.
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from .cython import mesh_core_cython
def get_normal(vertices, triangles):
''' calculate normal direction in each vertex
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
Returns:
normal: [nver, 3]
'''
pt0 = vertices[triangles[:, 0], :] # [ntri, 3]
pt1 = vertices[triangles[:, 1], :] # [ntri, 3]
pt2 = vertices[triangles[:, 2], :] # [ntri, 3]
tri_normal = np.cross(pt0 - pt1, pt0 - pt2) # [ntri, 3]. normal of each triangle
normal = np.zeros_like(vertices, dtype = np.float32).copy() # [nver, 3]
# for i in range(triangles.shape[0]):
# normal[triangles[i, 0], :] = normal[triangles[i, 0], :] + tri_normal[i, :]
# normal[triangles[i, 1], :] = normal[triangles[i, 1], :] + tri_normal[i, :]
# normal[triangles[i, 2], :] = normal[triangles[i, 2], :] + tri_normal[i, :]
mesh_core_cython.get_normal_core(normal, tri_normal.astype(np.float32).copy(), triangles.copy(), triangles.shape[0])
# normalize to unit length
mag = np.sum(normal**2, 1) # [nver]
zero_ind = (mag == 0)
mag[zero_ind] = 1;
normal[zero_ind, 0] = np.ones((np.sum(zero_ind)))
normal = normal/np.sqrt(mag[:,np.newaxis])
return normal
# TODO: test
def add_light_sh(vertices, triangles, colors, sh_coeff):
'''
In 3d face, usually assume:
1. The surface of face is Lambertian(reflect only the low frequencies of lighting)
2. Lighting can be an arbitrary combination of point sources
--> can be expressed in terms of spherical harmonics(omit the lighting coefficients)
I = albedo * (sh(n) x sh_coeff)
albedo: n x 1
sh_coeff: 9 x 1
Y(n) = (1, n_x, n_y, n_z, n_xn_y, n_xn_z, n_yn_z, n_x^2 - n_y^2, 3n_z^2 - 1)': n x 9
# Y(n) = (1, n_x, n_y, n_z)': n x 4
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
colors: [nver, 3] albedo
sh_coeff: [9, 1] spherical harmonics coefficients
Returns:
lit_colors: [nver, 3]
'''
assert vertices.shape[0] == colors.shape[0]
nver = vertices.shape[0]
normal = get_normal(vertices, triangles) # [nver, 3]
sh = np.array((np.ones(nver), normal[:,0], normal[:,1], normal[:,2], normal[:,0]*normal[:,1], normal[:,0]*normal[:,2], normal[:,1]*normal[:,2], normal[:,0]**2 - normal[:,1]**2, 3*(normal[:,2]**2) - 1)).T # [nver, 9]
ref = sh.dot(sh_coeff) # [nver, 1]
lit_colors = colors*ref
return lit_colors
def add_light(vertices, triangles, colors, light_positions = 0, light_intensities = 0):
''' Gouraud shading. add point lights.
In 3d face, usually assume:
1. The surface of face is Lambertian(reflect only the low frequencies of lighting)
2. Lighting can be an arbitrary combination of point sources
3. No specular (unless skin is oil, 23333)
Ref: https://cs184.eecs.berkeley.edu/lecture/pipeline
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
light_positions: [nlight, 3]
light_intensities: [nlight, 3]
Returns:
lit_colors: [nver, 3]
'''
nver = vertices.shape[0]
normals = get_normal(vertices, triangles) # [nver, 3]
# ambient
# La = ka*Ia
# diffuse
# Ld = kd*(I/r^2)max(0, nxl)
direction_to_lights = vertices[np.newaxis, :, :] - light_positions[:, np.newaxis, :] # [nlight, nver, 3]
direction_to_lights_n = np.sqrt(np.sum(direction_to_lights**2, axis = 2)) # [nlight, nver]
direction_to_lights = direction_to_lights/direction_to_lights_n[:, :, np.newaxis]
normals_dot_lights = normals[np.newaxis, :, :]*direction_to_lights # [nlight, nver, 3]
normals_dot_lights = np.sum(normals_dot_lights, axis = 2) # [nlight, nver]
diffuse_output = colors[np.newaxis, :, :]*normals_dot_lights[:, :, np.newaxis]*light_intensities[:, np.newaxis, :]
diffuse_output = np.sum(diffuse_output, axis = 0) # [nver, 3]
# specular
# h = (v + l)/(|v + l|) bisector
# Ls = ks*(I/r^2)max(0, nxh)^p
# increasing p narrows the reflection lobe
lit_colors = diffuse_output # only diffuse part here.
lit_colors = np.minimum(np.maximum(lit_colors, 0), 1)
return lit_colors
## TODO. estimate light(sh coeff)
## -------------------------------- estimate. can not use now.
def fit_light(image, vertices, colors, triangles, vis_ind, lamb = 10, max_iter = 3):
[h, w, c] = image.shape
# surface normal
norm = get_normal(vertices, triangles)
nver = vertices.shape[1]
# vertices --> corresponding image pixel
pt2d = vertices[:2, :]
pt2d[0,:] = np.minimum(np.maximum(pt2d[0,:], 0), w - 1)
pt2d[1,:] = np.minimum(np.maximum(pt2d[1,:], 0), h - 1)
pt2d = np.round(pt2d).astype(np.int32) # 2 x nver
image_pixel = image[pt2d[1,:], pt2d[0,:], :] # nver x 3
image_pixel = image_pixel.T # 3 x nver
# vertices --> corresponding mean texture pixel with illumination
# Spherical Harmonic Basis
harmonic_dim = 9
nx = norm[0,:];
ny = norm[1,:];
nz = norm[2,:];
harmonic = np.zeros((nver, harmonic_dim))
pi = np.pi
harmonic[:,0] = np.sqrt(1/(4*pi)) * np.ones((nver,));
harmonic[:,1] = np.sqrt(3/(4*pi)) * nx;
harmonic[:,2] = np.sqrt(3/(4*pi)) * ny;
harmonic[:,3] = np.sqrt(3/(4*pi)) * nz;
harmonic[:,4] = 1/2. * np.sqrt(3/(4*pi)) * (2*nz**2 - nx**2 - ny**2);
harmonic[:,5] = 3 * np.sqrt(5/(12*pi)) * (ny*nz);
harmonic[:,6] = 3 * np.sqrt(5/(12*pi)) * (nx*nz);
harmonic[:,7] = 3 * np.sqrt(5/(12*pi)) * (nx*ny);
harmonic[:,8] = 3/2. * np.sqrt(5/(12*pi)) * (nx*nx - ny*ny);
'''
I' = sum(albedo * lj * hj) j = 0:9 (albedo = tex)
set A = albedo*h (n x 9)
alpha = lj (9 x 1)
Y = I (n x 1)
Y' = A.dot(alpha)
opt function:
||Y - A*alpha|| + lambda*(alpha'*alpha)
result:
A'*(Y - A*alpha) + lambda*alpha = 0
==>
(A'*A*alpha - lambda)*alpha = A'*Y
left: 9 x 9
right: 9 x 1
'''
n_vis_ind = len(vis_ind)
n = n_vis_ind*c
Y = np.zeros((n, 1))
A = np.zeros((n, 9))
light = np.zeros((3, 1))
for k in range(c):
Y[k*n_vis_ind:(k+1)*n_vis_ind, :] = image_pixel[k, vis_ind][:, np.newaxis]
A[k*n_vis_ind:(k+1)*n_vis_ind, :] = texture[k, vis_ind][:, np.newaxis] * harmonic[vis_ind, :]
Ac = texture[k, vis_ind][:, np.newaxis]
Yc = image_pixel[k, vis_ind][:, np.newaxis]
light[k] = (Ac.T.dot(Yc))/(Ac.T.dot(Ac))
for i in range(max_iter):
Yc = Y.copy()
for k in range(c):
Yc[k*n_vis_ind:(k+1)*n_vis_ind, :] /= light[k]
# update alpha
equation_left = np.dot(A.T, A) + lamb*np.eye(harmonic_dim); # why + ?
equation_right = np.dot(A.T, Yc)
alpha = np.dot(np.linalg.inv(equation_left), equation_right)
# update light
for k in range(c):
Ac = A[k*n_vis_ind:(k+1)*n_vis_ind, :].dot(alpha)
Yc = Y[k*n_vis_ind:(k+1)*n_vis_ind, :]
light[k] = (Ac.T.dot(Yc))/(Ac.T.dot(Ac))
appearance = np.zeros_like(texture)
for k in range(c):
tmp = np.dot(harmonic*texture[k, :][:, np.newaxis], alpha*light[k])
appearance[k,:] = tmp.T
appearance = np.minimum(np.maximum(appearance, 0), 1)
return appearance
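A minimal shading sketch with add_light above (hypothetical arrays; assumes the compiled mesh_core_cython extension is available):
import numpy as np
vertices = np.array([[0., 0., 0.], [100., 0., 0.], [0., 100., 0.]], dtype=np.float32)
triangles = np.array([[0, 1, 2]], dtype=np.int32)
colors = np.full((3, 3), 0.8, dtype=np.float32)   # per-vertex albedo
light_positions = np.array([[50., 50., 300.]])    # one light above the mesh
light_intensities = np.array([[1., 1., 1.]])      # white light
lit_colors = add_light(vertices, triangles, colors, light_positions, light_intensities)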

View File

@@ -0,0 +1,135 @@
'''
functions about rendering mesh(from 3d obj to 2d image).
only use rasterization render here.
Note that:
1. Generally, render func includes camera, light, rasterize. Here no camera and light(I write these in other files)
2. Generally, the input vertices are normalized to [-1,1] and centered on [0, 0]. (in world space)
Here, the vertices are using image coords, which centers on [w/2, h/2] with the y-axis pointing in the opposite direction.
Means: render here only conducts interpolation.(I just want to make the input flexible)
Author: Yao Feng
Mail: yaofeng1995@gmail.com
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from time import time
from .cython import mesh_core_cython
def rasterize_triangles(vertices, triangles, h, w):
'''
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
h: height
w: width
Returns:
depth_buffer: [h, w] saves the depth; here, the bigger the z, the nearer the point.
triangle_buffer: [h, w] saves the tri id(-1 for no triangle).
barycentric_weight: [h, w, 3] saves corresponding barycentric weight.
# Each triangle has 3 vertices & Each vertex has 3 coordinates x, y, z.
# h, w is the size of rendering
'''
# initial
depth_buffer = np.zeros([h, w], dtype = np.float32) - 999999. # set the initial z to the farthest position
triangle_buffer = np.zeros([h, w], dtype = np.int32) - 1 # if tri id = -1, the pixel has no triangle correspondence
barycentric_weight = np.zeros([h, w, 3], dtype = np.float32)
vertices = vertices.astype(np.float32).copy()
triangles = triangles.astype(np.int32).copy()
mesh_core_cython.rasterize_triangles_core(
vertices, triangles,
depth_buffer, triangle_buffer, barycentric_weight,
vertices.shape[0], triangles.shape[0],
h, w)
return depth_buffer, triangle_buffer, barycentric_weight
def render_colors(vertices, triangles, colors, h, w, c = 3, BG = None):
''' render mesh with colors
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
colors: [nver, 3]
h: height
w: width
c: channel
BG: background image
Returns:
image: [h, w, c]. rendered image.
'''
# initial
if BG is None:
image = np.zeros((h, w, c), dtype = np.float32)
else:
assert BG.shape[0] == h and BG.shape[1] == w and BG.shape[2] == c
image = BG
depth_buffer = np.zeros([h, w], dtype = np.float32, order = 'C') - 999999.
# change orders. --> C-contiguous order(row major)
vertices = vertices.astype(np.float32).copy()
triangles = triangles.astype(np.int32).copy()
colors = colors.astype(np.float32).copy()
###
st = time()
mesh_core_cython.render_colors_core(
image, vertices, triangles,
colors,
depth_buffer,
vertices.shape[0], triangles.shape[0],
h, w, c)
return image
def render_texture(vertices, triangles, texture, tex_coords, tex_triangles, h, w, c = 3, mapping_type = 'nearest', BG = None):
''' render mesh with texture map
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
texture: [tex_h, tex_w, 3]
tex_coords: [ntexcoords, 3]
tex_triangles: [ntri, 3]
h: height of rendering
w: width of rendering
c: channel
mapping_type: 'bilinear' or 'nearest'
'''
# initial
if BG is None:
image = np.zeros((h, w, c), dtype = np.float32)
else:
assert BG.shape[0] == h and BG.shape[1] == w and BG.shape[2] == c
image = BG
depth_buffer = np.zeros([h, w], dtype = np.float32, order = 'C') - 999999.
tex_h, tex_w, tex_c = texture.shape
if mapping_type == 'nearest':
mt = int(0)
elif mapping_type == 'bilinear':
mt = int(1)
else:
mt = int(0)
# -> C order
vertices = vertices.astype(np.float32).copy()
triangles = triangles.astype(np.int32).copy()
texture = texture.astype(np.float32).copy()
tex_coords = tex_coords.astype(np.float32).copy()
tex_triangles = tex_triangles.astype(np.int32).copy()
mesh_core_cython.render_texture_core(
image, vertices, triangles,
texture, tex_coords, tex_triangles,
depth_buffer,
vertices.shape[0], tex_coords.shape[0], triangles.shape[0],
h, w, c,
tex_h, tex_w, tex_c,
mt)
return image
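A minimal rasterization sketch with render_colors above (hypothetical arrays; assumes the compiled extension is available):
import numpy as np
vertices = np.array([[30., 30., 0.], [220., 40., 0.], [130., 200., 0.]])
triangles = np.array([[0, 1, 2]])
colors = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])     # RGB corners
image = render_colors(vertices, triangles, colors, h=256, w=256)  # [256, 256, 3]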

View File

@@ -0,0 +1,383 @@
'''
Functions about transforming mesh(changing the position: modify vertices).
1. forward: transform(transform, camera, project).
2. backward: estimate transform matrix from correspondences.
Author: Yao Feng
Mail: yaofeng1995@gmail.com
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import math
from math import cos, sin
def angle2matrix(angles):
''' get rotation matrix from three rotation angles(degree). right-handed.
Args:
angles: [3,]. x, y, z angles
x: pitch. positive for looking down.
y: yaw. positive for looking left.
z: roll. positive for tilting head right.
Returns:
R: [3, 3]. rotation matrix.
'''
x, y, z = np.deg2rad(angles[0]), np.deg2rad(angles[1]), np.deg2rad(angles[2])
# x
Rx=np.array([[1, 0, 0],
[0, cos(x), -sin(x)],
[0, sin(x), cos(x)]])
# y
Ry=np.array([[ cos(y), 0, sin(y)],
[ 0, 1, 0],
[-sin(y), 0, cos(y)]])
# z
Rz=np.array([[cos(z), -sin(z), 0],
[sin(z), cos(z), 0],
[ 0, 0, 1]])
R=Rz.dot(Ry.dot(Rx))
return R.astype(np.float32)
def angle2matrix_3ddfa(angles):
''' get rotation matrix from three rotation angles(radian). The same as in 3DDFA.
Args:
angles: [3,]. x, y, z angles
x: pitch.
y: yaw.
z: roll.
Returns:
R: 3x3. rotation matrix.
'''
# x, y, z = np.deg2rad(angles[0]), np.deg2rad(angles[1]), np.deg2rad(angles[2])
x, y, z = angles[0], angles[1], angles[2]
# x
Rx=np.array([[1, 0, 0],
[0, cos(x), sin(x)],
[0, -sin(x), cos(x)]])
# y
Ry=np.array([[ cos(y), 0, -sin(y)],
[ 0, 1, 0],
[sin(y), 0, cos(y)]])
# z
Rz=np.array([[cos(z), sin(z), 0],
[-sin(z), cos(z), 0],
[ 0, 0, 1]])
R = Rx.dot(Ry).dot(Rz)
return R.astype(np.float32)
## ------------------------------------------ 1. transform(transform, project, camera).
## ---------- 3d-3d transform. Transform obj in world space
def rotate(vertices, angles):
''' rotate vertices.
X_new = R.dot(X). X: 3 x 1
Args:
vertices: [nver, 3].
rx, ry, rz: degree angles
rx: pitch. positive for looking down
ry: yaw. positive for looking left
rz: roll. positive for tilting head right
Returns:
rotated vertices: [nver, 3]
'''
R = angle2matrix(angles)
rotated_vertices = vertices.dot(R.T)
return rotated_vertices
def similarity_transform(vertices, s, R, t3d):
''' similarity transform. dof = 7.
3D: s*R.dot(X) + t
Homo: M = [[sR, t],[0^T, 1]]. M.dot(X)
Args:(float32)
vertices: [nver, 3].
s: [1,]. scale factor.
R: [3,3]. rotation matrix.
t3d: [3,]. 3d translation vector.
Returns:
transformed vertices: [nver, 3]
'''
t3d = np.squeeze(np.array(t3d, dtype = np.float32))
transformed_vertices = s * vertices.dot(R.T) + t3d[np.newaxis, :]
return transformed_vertices
## -------------- Camera. from world space to camera space
# Ref: https://cs184.eecs.berkeley.edu/lecture/transforms-2
def normalize(x):
epsilon = 1e-12
norm = np.sqrt(np.sum(x**2, axis = 0))
norm = np.maximum(norm, epsilon)
return x/norm
def lookat_camera(vertices, eye, at = None, up = None):
""" 'look at' transformation: from world space to camera space
standard camera space:
camera located at the origin.
looking down negative z-axis.
vertical vector is y-axis.
Xcam = R(X - C)
Homo: [[R, -RC], [0, 1]]
Args:
vertices: [nver, 3]
eye: [3,] the XYZ world space position of the camera.
at: [3,] a position along the center of the camera's gaze.
up: [3,] up direction
Returns:
transformed_vertices: [nver, 3]
"""
if at is None:
at = np.array([0, 0, 0], np.float32)
if up is None:
up = np.array([0, 1, 0], np.float32)
eye = np.array(eye).astype(np.float32)
at = np.array(at).astype(np.float32)
z_axis = -normalize(at - eye) # look forward
x_axis = normalize(np.cross(up, z_axis)) # look right
y_axis = np.cross(z_axis, x_axis) # look up
R = np.stack((x_axis, y_axis, z_axis)) # 3 x 3
transformed_vertices = vertices - eye # translation
transformed_vertices = transformed_vertices.dot(R.T) # rotation
return transformed_vertices
## --------- 3d-2d project. from camera space to image plane
# generally, image plane only keeps x,y channels, here reserve z channel for calculating z-buffer.
def orthographic_project(vertices):
''' scaled orthographic projection(just delete z)
assumes: variations in depth over the object is small relative to the mean distance from camera to object
x -> x*f/z, y -> y*f/z, z -> f.
for point i,j. zi~=zj. so just delete z
** often used in face
Homo: P = [[1,0,0,0], [0,1,0,0], [0,0,1,0]]
Args:
vertices: [nver, 3]
Returns:
projected_vertices: [nver, 3]. z is kept for later z-buffering.
'''
return vertices.copy()
def perspective_project(vertices, fovy, aspect_ratio = 1., near = 0.1, far = 1000.):
''' perspective projection.
Args:
vertices: [nver, 3]
fovy: vertical angular field of view. degree.
aspect_ratio : width / height of field of view
near : depth of near clipping plane
far : depth of far clipping plane
Returns:
projected_vertices: [nver, 3]
'''
fovy = np.deg2rad(fovy)
top = near*np.tan(fovy)
bottom = -top
right = top*aspect_ratio
left = -right
#-- homo
P = np.array([[near/right, 0, 0, 0],
[0, near/top, 0, 0],
[0, 0, -(far+near)/(far-near), -2*far*near/(far-near)],
[0, 0, -1, 0]])
vertices_homo = np.hstack((vertices, np.ones((vertices.shape[0], 1)))) # [nver, 4]
projected_vertices = vertices_homo.dot(P.T)
projected_vertices = projected_vertices/projected_vertices[:,3:]
projected_vertices = projected_vertices[:,:3]
projected_vertices[:,2] = -projected_vertices[:,2]
#-- non homo. only fovy
# projected_vertices = vertices.copy()
# projected_vertices[:,0] = -(near/right)*vertices[:,0]/vertices[:,2]
# projected_vertices[:,1] = -(near/top)*vertices[:,1]/vertices[:,2]
return projected_vertices
def to_image(vertices, h, w, is_perspective = False):
''' change vertices to image coord system
3d system: XYZ, center(0, 0, 0)
2d image: x(u), y(v). center(w/2, h/2), flip y-axis.
Args:
vertices: [nver, 3]
h: height of the rendering
w : width of the rendering
Returns:
projected_vertices: [nver, 3]
'''
image_vertices = vertices.copy()
if is_perspective:
# if perspective, the projected vertices are normalized to [-1, 1]. so change it to image size first.
image_vertices[:,0] = image_vertices[:,0]*w/2
image_vertices[:,1] = image_vertices[:,1]*h/2
# move to center of image
image_vertices[:,0] = image_vertices[:,0] + w/2
image_vertices[:,1] = image_vertices[:,1] + h/2
# flip vertices along y-axis.
image_vertices[:,1] = h - image_vertices[:,1] - 1
return image_vertices
#### -------------------------------------------2. estimate transform matrix from correspondences.
def estimate_affine_matrix_3d23d(X, Y):
''' Using least-squares solution
Args:
X: [n, 3]. 3d points(fixed)
Y: [n, 3]. corresponding 3d points(moving). Y = PX
Returns:
P_Affine: (3, 4). Affine camera matrix (the third row is [0, 0, 0, 1]).
'''
X_homo = np.hstack((X, np.ones([X.shape[0],1]))) #n x 4
P = np.linalg.lstsq(X_homo, Y, rcond = None)[0].T # Affine matrix. 3 x 4
return P
def estimate_affine_matrix_3d22d(X, x):
''' Using Golden Standard Algorithm for estimating an affine camera
matrix P from world to image correspondences.
See Alg.7.2. in MVGCV
Code Ref: https://github.com/patrikhuber/eos/blob/master/include/eos/fitting/affine_camera_estimation.hpp
x_homo = X_homo.dot(P_Affine)
Args:
X: [n, 3]. corresponding 3d points(fixed)
x: [n, 2]. n>=4. 2d points(moving). x = PX
Returns:
P_Affine: [3, 4]. Affine camera matrix
'''
X = X.T; x = x.T
assert(x.shape[1] == X.shape[1])
n = x.shape[1]
assert(n >= 4)
#--- 1. normalization
# 2d points
mean = np.mean(x, 1) # (2,)
x = x - np.tile(mean[:, np.newaxis], [1, n])
average_norm = np.mean(np.sqrt(np.sum(x**2, 0)))
scale = np.sqrt(2) / average_norm
x = scale * x
T = np.zeros((3,3), dtype = np.float32)
T[0, 0] = T[1, 1] = scale
T[:2, 2] = -mean*scale
T[2, 2] = 1
# 3d points
mean = np.mean(X, 1) # (3,)
X = X - np.tile(mean[:, np.newaxis], [1, n])
average_norm = np.mean(np.sqrt(np.sum(X**2, 0)))
scale = np.sqrt(3) / average_norm
X = scale * X
U = np.zeros((4,4), dtype = np.float32)
U[0, 0] = U[1, 1] = U[2, 2] = scale
U[:3, 3] = -mean*scale
U[3, 3] = 1
# --- 2. equations
A = np.zeros((n*2, 8), dtype = np.float32);
X_homo = np.vstack((X, np.ones((1, n)))).T
A[:n, :4] = X_homo
A[n:, 4:] = X_homo
b = np.reshape(x, [-1, 1])
# --- 3. solution
p_8 = np.linalg.pinv(A).dot(b)
P = np.zeros((3, 4), dtype = np.float32)
P[0, :] = p_8[:4, 0]
P[1, :] = p_8[4:, 0]
P[-1, -1] = 1
# --- 4. denormalization
P_Affine = np.linalg.inv(T).dot(P.dot(U))
return P_Affine
def P2sRt(P):
''' decompositing camera matrix P
Args:
P: (3, 4). Affine Camera Matrix.
Returns:
s: scale factor.
R: (3, 3). rotation matrix.
t: (3,). translation.
'''
t = P[:, 3]
R1 = P[0:1, :3]
R2 = P[1:2, :3]
s = (np.linalg.norm(R1) + np.linalg.norm(R2))/2.0
r1 = R1/np.linalg.norm(R1)
r2 = R2/np.linalg.norm(R2)
r3 = np.cross(r1, r2)
R = np.concatenate((r1, r2, r3), 0)
return s, R, t
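# --- Synthetic sanity check (a sketch; not part of the original file): build
# a known similarity transform, project to 2d, then recover it with
# estimate_affine_matrix_3d22d + P2sRt.
def _demo_pose_estimation():
    theta = np.deg2rad(15.)
    R_true = np.array([[np.cos(theta), -np.sin(theta), 0.],
                       [np.sin(theta),  np.cos(theta), 0.],
                       [0., 0., 1.]])  # rotation about z
    s_true, t_true = 2.0, np.array([3., -1.])
    X = np.random.rand(10, 3)  # [n, 3] 3d points
    x = s_true*X.dot(R_true.T)[:, :2] + t_true  # [n, 2] 2d correspondences
    P = estimate_affine_matrix_3d22d(X, x)
    s, R, t = P2sRt(P)
    print(s, t[:2])  # ~2.0 and ~[3., -1.]; R approximates R_true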
#Ref: https://www.learnopencv.com/rotation-matrix-to-euler-angles/
def isRotationMatrix(R):
''' checks whether a matrix is a valid rotation matrix (i.e. orthogonal: R^T R = I)
'''
Rt = np.transpose(R)
shouldBeIdentity = np.dot(Rt, R)
I = np.identity(3, dtype = R.dtype)
n = np.linalg.norm(I - shouldBeIdentity)
return n < 1e-6
def matrix2angle(R):
''' get three Euler angles from Rotation Matrix
Args:
R: (3,3). rotation matrix
Returns:
x: pitch
y: yaw
z: roll
'''
assert(isRotationMatrix(R))
sy = math.sqrt(R[0,0] * R[0,0] + R[1,0] * R[1,0])
singular = sy < 1e-6
if not singular :
x = math.atan2(R[2,1] , R[2,2])
y = math.atan2(-R[2,0], sy)
z = math.atan2(R[1,0], R[0,0])
else :
x = math.atan2(-R[1,2], R[1,1])
y = math.atan2(-R[2,0], sy)
z = 0
# rx, ry, rz = np.rad2deg(x), np.rad2deg(y), np.rad2deg(z)
rx, ry, rz = x*180/np.pi, y*180/np.pi, z*180/np.pi
return rx, ry, rz
# def matrix2angle(R):
# ''' compute three Euler angles from a Rotation Matrix. Ref: http://www.gregslabaugh.net/publications/euler.pdf
# Args:
# R: (3,3). rotation matrix
# Returns:
# x: yaw
# y: pitch
# z: roll
# '''
# # assert(isRotationMatrix(R))
# if R[2,0] != 1 and R[2,0] != -1:
# x = math.asin(R[2,0])
# y = math.atan2(R[2,1]/cos(x), R[2,2]/cos(x))
# z = math.atan2(R[1,0]/cos(x), R[0,0]/cos(x))
# else:# Gimbal lock
# z = 0 #can be anything
# if R[2,0] == -1:
# x = np.pi/2
# y = z + math.atan2(R[0,1], R[0,2])
# else:
# x = -np.pi/2
# y = -z + math.atan2(-R[0,1], -R[0,2])
# return x, y, z

View File

@ -0,0 +1,24 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
from skimage import measure
from mpl_toolkits.mplot3d import Axes3D
def plot_mesh(vertices, triangles, subplot = [1,1,1], title = 'mesh', el = 90, az = -90, lwdt=.1, dist = 6, color = "grey"):
'''
plot the mesh
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
'''
ax = plt.subplot(subplot[0], subplot[1], subplot[2], projection = '3d')
ax.plot_trisurf(vertices[:, 0], vertices[:, 1], vertices[:, 2], triangles = triangles, lw = lwdt, color = color, alpha = 1)
ax.axis("off")
ax.view_init(elev = el, azim = az)
ax.dist = dist
plt.title(title)
### -------------- Todo: use vtk to visualize mesh? or visvis? or VisPy?

View File

@ -0,0 +1,10 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from . import io
from . import vis
from . import transform
from . import light
from . import render

View File

@ -0,0 +1,170 @@
''' io: read&write mesh
1. read obj as array(TODO)
2. write arrays to obj
Preparation knowledge:
representations of 3d face: mesh, point cloud...
storage format: obj, ply, bin, asc, mat...
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import os
from skimage import io
## TODO
## TODO: c++ version
def read_obj(obj_name):
''' read mesh
'''
return 0
# ------------------------- write
def write_asc(path, vertices):
'''
Args:
vertices: shape = (nver, 3)
'''
if path.split('.')[-1] == 'asc':
np.savetxt(path, vertices)
else:
np.savetxt(path + '.asc', vertices)
def write_obj_with_colors(obj_name, vertices, triangles, colors):
''' Save 3D face model with texture represented by colors.
Args:
obj_name: str
vertices: shape = (nver, 3)
triangles: shape = (ntri, 3)
colors: shape = (nver, 3)
'''
triangles = triangles.copy()
triangles += 1 # meshlab start with 1
if obj_name.split('.')[-1] != 'obj':
obj_name = obj_name + '.obj'
# write obj
with open(obj_name, 'w') as f:
# write vertices & colors
for i in range(vertices.shape[0]):
# s = 'v {} {} {} \n'.format(vertices[0,i], vertices[1,i], vertices[2,i])
s = 'v {} {} {} {} {} {}\n'.format(vertices[i, 0], vertices[i, 1], vertices[i, 2], colors[i, 0], colors[i, 1], colors[i, 2])
f.write(s)
# write f: ver ind/ uv ind
for i in range(triangles.shape[0]):
# s = 'f {} {} {}\n'.format(triangles[i, 0], triangles[i, 1], triangles[i, 2])
s = 'f {} {} {}\n'.format(triangles[i, 2], triangles[i, 1], triangles[i, 0])
f.write(s)
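# --- Minimal usage sketch (the output path is illustrative): write a single
# colored triangle and inspect the result in MeshLab.
def _demo_write_obj():
    vertices = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
    triangles = np.array([[0, 1, 2]], dtype = np.int32)
    colors = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])  # rgb per vertex
    write_obj_with_colors('triangle.obj', vertices, triangles, colors)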
## TODO: c++ version
def write_obj_with_texture(obj_name, vertices, triangles, texture, uv_coords):
''' Save 3D face model with texture represented by texture map.
Ref: https://github.com/patrikhuber/eos/blob/bd00155ebae4b1a13b08bf5a991694d682abbada/include/eos/core/Mesh.hpp
Args:
obj_name: str
vertices: shape = (nver, 3)
triangles: shape = (ntri, 3)
texture: shape = (256,256,3)
uv_coords: shape = (nver, 2) or (nver, 3); only the first two columns are used. values in [0, 1]
'''
if obj_name.split('.')[-1] != 'obj':
obj_name = obj_name + '.obj'
mtl_name = obj_name.replace('.obj', '.mtl')
texture_name = obj_name.replace('.obj', '_texture.png')
triangles = triangles.copy()
triangles += 1 # mesh lab start with 1
# write obj
with open(obj_name, 'w') as f:
# first line: write mtlib(material library)
s = "mtllib {}\n".format(os.path.abspath(mtl_name))
f.write(s)
# write vertices
for i in range(vertices.shape[0]):
s = 'v {} {} {}\n'.format(vertices[i, 0], vertices[i, 1], vertices[i, 2])
f.write(s)
# write uv coords
for i in range(uv_coords.shape[0]):
# s = 'vt {} {}\n'.format(uv_coords[i,0], 1 - uv_coords[i,1])
s = 'vt {} {}\n'.format(uv_coords[i,0], uv_coords[i,1])
f.write(s)
f.write("usemtl FaceTexture\n")
# write f: ver ind/ uv ind
for i in range(triangles.shape[0]):
s = 'f {}/{} {}/{} {}/{}\n'.format(triangles[i,2], triangles[i,2], triangles[i,1], triangles[i,1], triangles[i,0], triangles[i,0])
f.write(s)
# write mtl
with open(mtl_name, 'w') as f:
f.write("newmtl FaceTexture\n")
s = 'map_Kd {}\n'.format(os.path.abspath(texture_name)) # map to image
f.write(s)
# write texture as png
io.imsave(texture_name, texture)
def write_obj_with_colors_texture(obj_name, vertices, triangles, colors, texture, uv_coords):
''' Save 3D face model with texture.
Ref: https://github.com/patrikhuber/eos/blob/bd00155ebae4b1a13b08bf5a991694d682abbada/include/eos/core/Mesh.hpp
Args:
obj_name: str
vertices: shape = (nver, 3)
triangles: shape = (ntri, 3)
colors: shape = (nver, 3)
texture: shape = (256,256,3)
uv_coords: shape = (nver, 2) or (nver, 3); only the first two columns are used. values in [0, 1]
'''
if obj_name.split('.')[-1] != 'obj':
obj_name = obj_name + '.obj'
mtl_name = obj_name.replace('.obj', '.mtl')
texture_name = obj_name.replace('.obj', '_texture.png')
triangles = triangles.copy()
triangles += 1 # mesh lab start with 1
# write obj
with open(obj_name, 'w') as f:
# first line: write mtlib(material library)
s = "mtllib {}\n".format(os.path.abspath(mtl_name))
f.write(s)
# write vertices
for i in range(vertices.shape[0]):
s = 'v {} {} {} {} {} {}\n'.format(vertices[i, 0], vertices[i, 1], vertices[i, 2], colors[i, 0], colors[i, 1], colors[i, 2])
f.write(s)
# write uv coords
for i in range(uv_coords.shape[0]):
# s = 'vt {} {}\n'.format(uv_coords[i,0], 1 - uv_coords[i,1])
s = 'vt {} {}\n'.format(uv_coords[i,0], uv_coords[i,1])
f.write(s)
f.write("usemtl FaceTexture\n")
# write f: ver ind/ uv ind
for i in range(triangles.shape[0]):
# s = 'f {}/{} {}/{} {}/{}\n'.format(triangles[i,0], triangles[i,0], triangles[i,1], triangles[i,1], triangles[i,2], triangles[i,2])
s = 'f {}/{} {}/{} {}/{}\n'.format(triangles[i,2], triangles[i,2], triangles[i,1], triangles[i,1], triangles[i,0], triangles[i,0])
f.write(s)
# write mtl
with open(mtl_name, 'w') as f:
f.write("newmtl FaceTexture\n")
s = 'map_Kd {}\n'.format(os.path.abspath(texture_name)) # map to image
f.write(s)
# write texture as png
io.imsave(texture_name, texture)

View File

@ -0,0 +1,215 @@
'''
Functions about lighting mesh(changing colors/texture of mesh).
1. add light to colors/texture (shade each vertex)
2. fit light according to colors/texture & image.
Preparation knowledge:
lighting: https://cs184.eecs.berkeley.edu/lecture/pipeline
spherical harmonics in human face: '3D Face Reconstruction from a Single Image Using a Single Reference Face Shape'
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
def get_normal(vertices, triangles):
''' calculate normal direction in each vertex
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
Returns:
normal: [nver, 3]
'''
pt0 = vertices[triangles[:, 0], :] # [ntri, 3]
pt1 = vertices[triangles[:, 1], :] # [ntri, 3]
pt2 = vertices[triangles[:, 2], :] # [ntri, 3]
tri_normal = np.cross(pt0 - pt1, pt0 - pt2) # [ntri, 3]. normal of each triangle
normal = np.zeros_like(vertices) # [nver, 3]
for i in range(triangles.shape[0]):
normal[triangles[i, 0], :] = normal[triangles[i, 0], :] + tri_normal[i, :]
normal[triangles[i, 1], :] = normal[triangles[i, 1], :] + tri_normal[i, :]
normal[triangles[i, 2], :] = normal[triangles[i, 2], :] + tri_normal[i, :]
# normalize to unit length
mag = np.sum(normal**2, 1) # [nver]
zero_ind = (mag == 0)
mag[zero_ind] = 1;
normal[zero_ind, 0] = np.ones((np.sum(zero_ind)))
normal = normal/np.sqrt(mag[:,np.newaxis])
return normal
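# --- Equivalent vectorized accumulation (a sketch; np.add.at replaces the
# per-triangle Python loop in get_normal above and should produce the same
# output, considerably faster for large meshes).
def get_normal_fast(vertices, triangles):
    pt0 = vertices[triangles[:, 0], :]  # [ntri, 3]
    pt1 = vertices[triangles[:, 1], :]
    pt2 = vertices[triangles[:, 2], :]
    tri_normal = np.cross(pt0 - pt1, pt0 - pt2)  # [ntri, 3]
    normal = np.zeros_like(vertices)  # [nver, 3]
    for k in range(3):  # scatter-add each triangle normal onto its 3 vertices
        np.add.at(normal, triangles[:, k], tri_normal)
    mag = np.sum(normal**2, 1)  # [nver]
    zero_ind = (mag == 0)
    mag[zero_ind] = 1
    normal[zero_ind, 0] = 1.
    return normal/np.sqrt(mag[:, np.newaxis])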
# TODO: test
def add_light_sh(vertices, triangles, colors, sh_coeff):
'''
In 3d face, usually assume:
1. The surface of face is Lambertian(reflect only the low frequencies of lighting)
2. Lighting can be an arbitrary combination of point sources
--> can be expressed in terms of spherical harmonics(omit the lighting coefficients)
I = albedo * (sh(n) x sh_coeff)
albedo: n x 1
sh_coeff: 9 x 1
Y(n) = (1, n_x, n_y, n_z, n_xn_y, n_xn_z, n_yn_z, n_x^2 - n_y^2, 3n_z^2 - 1)': n x 9
# Y(n) = (1, n_x, n_y, n_z)': n x 4
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
colors: [nver, 3] albedo
sh_coeff: [9, 1] spherical harmonics coefficients
Returns:
lit_colors: [nver, 3]
'''
assert vertices.shape[0] == colors.shape[0]
nver = vertices.shape[0]
normal = get_normal(vertices, triangles) # [nver, 3]
n = normal # short alias for the basis expression below
sh = np.array((np.ones(nver), n[:,0], n[:,1], n[:,2], n[:,0]*n[:,1], n[:,0]*n[:,2], n[:,1]*n[:,2], n[:,0]**2 - n[:,1]**2, 3*(n[:,2]**2) - 1)).T # [nver, 9]
ref = sh.dot(sh_coeff) #[nver, 1]
lit_colors = colors*ref
return lit_colors
def add_light(vertices, triangles, colors, light_positions = 0, light_intensities = 0):
''' Gouraud shading. add point lights.
In 3d face, usually assume:
1. The surface of face is Lambertian(reflect only the low frequencies of lighting)
2. Lighting can be an arbitrary combination of point sources
3. No specular term (unless the skin is oily)
Ref: https://cs184.eecs.berkeley.edu/lecture/pipeline
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
light_positions: [nlight, 3]
light_intensities: [nlight, 3]
Returns:
lit_colors: [nver, 3]
'''
nver = vertices.shape[0]
normals = get_normal(vertices, triangles) # [nver, 3]
# ambient
# La = ka*Ia
# diffuse
# Ld = kd*(I/r^2)max(0, nxl)
direction_to_lights = light_positions[:, np.newaxis, :] - vertices[np.newaxis, :, :] # [nlight, nver, 3], vertex -> light
direction_to_lights_n = np.sqrt(np.sum(direction_to_lights**2, axis = 2)) # [nlight, nver]
direction_to_lights = direction_to_lights/direction_to_lights_n[:, :, np.newaxis]
normals_dot_lights = normals[np.newaxis, :, :]*direction_to_lights # [nlight, nver, 3]
normals_dot_lights = np.maximum(np.sum(normals_dot_lights, axis = 2), 0) # [nlight, nver], max(0, n.l)
diffuse_output = colors[np.newaxis, :, :]*normals_dot_lights[:, :, np.newaxis]*light_intensities[:, np.newaxis, :]
diffuse_output = np.sum(diffuse_output, axis = 0) # [nver, 3]
# specular
# h = (v + l)/(|v + l|) bisector
# Ls = ks*(I/r^2)max(0, nxh)^p
# increasing p narrows the reflection lobe
lit_colors = diffuse_output # only diffuse part here.
lit_colors = np.minimum(np.maximum(lit_colors, 0), 1)
return lit_colors
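# --- Illustrative sketch (toy data; not part of the original file): shade a
# single upward-facing triangle with one point light placed above it.
def _demo_point_light():
    vertices = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
    triangles = np.array([[0, 1, 2]], dtype = np.int32)
    colors = np.ones((3, 3))*0.8  # grey albedo
    light_positions = np.array([[0.3, 0.3, 5.]])  # above the triangle
    light_intensities = np.ones((1, 3))
    lit = add_light(vertices, triangles, colors, light_positions, light_intensities)
    print(lit)  # brighter where n.l is larger, clipped to [0, 1]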
## TODO. estimate light(sh coeff)
## -------------------------------- estimation. not usable yet: expects transposed [3, n]/[c, n] layouts and is untested.
def fit_light(image, vertices, colors, triangles, vis_ind, lamb = 10, max_iter = 3):
[h, w, c] = image.shape
# surface normal
norm = get_normal(vertices, triangles)
nver = vertices.shape[1]
# vertices --> corresponding image pixel
pt2d = vertices[:2, :]
pt2d[0,:] = np.minimum(np.maximum(pt2d[0,:], 0), w - 1)
pt2d[1,:] = np.minimum(np.maximum(pt2d[1,:], 0), h - 1)
pt2d = np.round(pt2d).astype(np.int32) # 2 x nver
image_pixel = image[pt2d[1,:], pt2d[0,:], :] # nver x 3
image_pixel = image_pixel.T # 3 x nver
# vertices --> corresponding mean texture pixel with illumination
# Spherical Harmonic Basis
harmonic_dim = 9
nx = norm[0,:];
ny = norm[1,:];
nz = norm[2,:];
harmonic = np.zeros((nver, harmonic_dim))
pi = np.pi
harmonic[:,0] = np.sqrt(1/(4*pi)) * np.ones((nver,));
harmonic[:,1] = np.sqrt(3/(4*pi)) * nx;
harmonic[:,2] = np.sqrt(3/(4*pi)) * ny;
harmonic[:,3] = np.sqrt(3/(4*pi)) * nz;
harmonic[:,4] = 1/2. * np.sqrt(3/(4*pi)) * (2*nz**2 - nx**2 - ny**2);
harmonic[:,5] = 3 * np.sqrt(5/(12*pi)) * (ny*nz);
harmonic[:,6] = 3 * np.sqrt(5/(12*pi)) * (nx*nz);
harmonic[:,7] = 3 * np.sqrt(5/(12*pi)) * (nx*ny);
harmonic[:,8] = 3/2. * np.sqrt(5/(12*pi)) * (nx*nx - ny*ny);
'''
I' = sum(albedo * lj * hj) j = 0:9 (albedo = tex)
set A = albedo*h (n x 9)
alpha = lj (9 x 1)
Y = I (n x 1)
Y' = A.dot(alpha)
opt function:
||Y - A*alpha||^2 + lambda*(alpha'*alpha)
result:
-A'*(Y - A*alpha) + lambda*alpha = 0
==>
(A'*A + lambda*I)*alpha = A'*Y
left: 9 x 9
right: 9 x 1
'''
n_vis_ind = len(vis_ind)
n = n_vis_ind*c
Y = np.zeros((n, 1))
A = np.zeros((n, 9))
light = np.zeros((3, 1))
for k in range(c):
Y[k*n_vis_ind:(k+1)*n_vis_ind, :] = image_pixel[k, vis_ind][:, np.newaxis]
A[k*n_vis_ind:(k+1)*n_vis_ind, :] = colors[k, vis_ind][:, np.newaxis] * harmonic[vis_ind, :]
Ac = colors[k, vis_ind][:, np.newaxis]
Yc = image_pixel[k, vis_ind][:, np.newaxis]
light[k] = (Ac.T.dot(Yc))/(Ac.T.dot(Ac))
for i in range(max_iter):
Yc = Y.copy()
for k in range(c):
Yc[k*n_vis_ind:(k+1)*n_vis_ind, :] /= light[k]
# update alpha
equation_left = np.dot(A.T, A) + lamb*np.eye(harmonic_dim) # ridge (Tikhonov) term keeps the normal equations well-conditioned
equation_right = np.dot(A.T, Yc)
alpha = np.dot(np.linalg.inv(equation_left), equation_right)
# update light
for k in range(c):
Ac = A[k*n_vis_ind:(k+1)*n_vis_ind, :].dot(alpha)
Yc = Y[k*n_vis_ind:(k+1)*n_vis_ind, :]
light[k] = (Ac.T.dot(Yc))/(Ac.T.dot(Ac))
appearance = np.zeros_like(colors)
for k in range(c):
tmp = np.dot(harmonic*colors[k, :][:, np.newaxis], alpha*light[k])
appearance[k,:] = tmp.T
appearance = np.minimum(np.maximum(appearance, 0), 1)
return appearance

View File

@ -0,0 +1,287 @@
'''
functions about rendering a mesh (from 3d obj to 2d image).
only rasterization rendering is used here.
Note that:
1. Generally, a render function includes camera, light and rasterization. There is no camera or light here (I write those in other files).
2. Generally, the input vertices are normalized to [-1,1] and centered on [0, 0] (in world space).
Here, the vertices use image coords, which center on [w/2, h/2] with the y-axis pointing in the opposite direction.
Means: the render here only conducts interpolation. (I just want to make the input flexible)
Preparation knowledge:
z-buffer: https://cs184.eecs.berkeley.edu/lecture/pipeline
Author: Yao Feng
Mail: yaofeng1995@gmail.com
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
from time import time
def isPointInTri(point, tri_points):
''' Judge whether the point is in the triangle
Method:
http://blackpawn.com/texts/pointinpoly/
Args:
point: (2,). [u, v] or [x, y]
tri_points: (3 vertices, 2 coords). three vertices(2d points) of a triangle.
Returns:
bool: true for in triangle
'''
tp = tri_points
# vectors
v0 = tp[2,:] - tp[0,:]
v1 = tp[1,:] - tp[0,:]
v2 = point - tp[0,:]
# dot products
dot00 = np.dot(v0.T, v0)
dot01 = np.dot(v0.T, v1)
dot02 = np.dot(v0.T, v2)
dot11 = np.dot(v1.T, v1)
dot12 = np.dot(v1.T, v2)
# barycentric coordinates
if dot00*dot11 - dot01*dot01 == 0:
inverDeno = 0
else:
inverDeno = 1/(dot00*dot11 - dot01*dot01)
u = (dot11*dot02 - dot01*dot12)*inverDeno
v = (dot00*dot12 - dot01*dot02)*inverDeno
# check if point in triangle
return (u >= 0) & (v >= 0) & (u + v < 1)
def get_point_weight(point, tri_points):
''' Get the weights of the position
Methods: https://gamedev.stackexchange.com/questions/23743/whats-the-most-efficient-way-to-find-barycentric-coordinates
-m1.compute the area of the triangles formed by embedding the point P inside the triangle
-m2.Christer Ericson's book "Real-Time Collision Detection". faster.(used)
Args:
point: (2,). [u, v] or [x, y]
tri_points: (3 vertices, 2 coords). three vertices(2d points) of a triangle.
Returns:
w0: weight of v0
w1: weight of v1
w2: weight of v2
'''
tp = tri_points
# vectors
v0 = tp[2,:] - tp[0,:]
v1 = tp[1,:] - tp[0,:]
v2 = point - tp[0,:]
# dot products
dot00 = np.dot(v0.T, v0)
dot01 = np.dot(v0.T, v1)
dot02 = np.dot(v0.T, v2)
dot11 = np.dot(v1.T, v1)
dot12 = np.dot(v1.T, v2)
# barycentric coordinates
if dot00*dot11 - dot01*dot01 == 0:
inverDeno = 0
else:
inverDeno = 1/(dot00*dot11 - dot01*dot01)
u = (dot11*dot02 - dot01*dot12)*inverDeno
v = (dot00*dot12 - dot01*dot02)*inverDeno
w0 = 1 - u - v
w1 = v
w2 = u
return w0, w1, w2
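# --- Quick sanity check (a sketch; toy numbers): the barycentric weights
# reconstruct the query point, w0*tp[0] + w1*tp[1] + w2*tp[2] == point.
def _demo_barycentric():
    tp = np.array([[0., 0.], [4., 0.], [0., 4.]])
    point = np.array([1., 1.])
    w0, w1, w2 = get_point_weight(point, tp)
    print(w0, w1, w2)  # 0.5, 0.25, 0.25
    print(w0*tp[0] + w1*tp[1] + w2*tp[2])  # [1. 1.]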
def rasterize_triangles(vertices, triangles, h, w):
'''
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
h: height
w: width
Returns:
depth_buffer: [h, w] saves the depth; the bigger the z, the closer the point is to the camera.
triangle_buffer: [h, w] saves the tri id(-1 for no triangle).
barycentric_weight: [h, w, 3] saves corresponding barycentric weight.
# Each triangle has 3 vertices & Each vertex has 3 coordinates x, y, z.
# h, w is the size of rendering
'''
# initial
depth_buffer = np.zeros([h, w]) - 999999. #+ np.min(vertices[2,:]) - 999999. # set the initial z to the farest position
triangle_buffer = np.zeros([h, w], dtype = np.int32) - 1 # if tri id = -1, the pixel has no triangle correspondance
barycentric_weight = np.zeros([h, w, 3], dtype = np.float32) #
for i in range(triangles.shape[0]):
tri = triangles[i, :] # 3 vertex indices
# the inner bounding box
umin = max(int(np.ceil(np.min(vertices[tri, 0]))), 0)
umax = min(int(np.floor(np.max(vertices[tri, 0]))), w-1)
vmin = max(int(np.ceil(np.min(vertices[tri, 1]))), 0)
vmax = min(int(np.floor(np.max(vertices[tri, 1]))), h-1)
if umax<umin or vmax<vmin:
continue
for u in range(umin, umax+1):
for v in range(vmin, vmax+1):
if not isPointInTri([u,v], vertices[tri, :2]):
continue
w0, w1, w2 = get_point_weight([u, v], vertices[tri, :2]) # barycentric weight
point_depth = w0*vertices[tri[0], 2] + w1*vertices[tri[1], 2] + w2*vertices[tri[2], 2]
if point_depth > depth_buffer[v, u]:
depth_buffer[v, u] = point_depth
triangle_buffer[v, u] = i
barycentric_weight[v, u, :] = np.array([w0, w1, w2])
return depth_buffer, triangle_buffer, barycentric_weight
def render_colors_ras(vertices, triangles, colors, h, w, c = 3):
''' render mesh with colors(rasterize triangle first)
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
colors: [nver, 3]
h: height
w: width
c: channel
Returns:
image: [h, w, c]. rendering.
'''
assert vertices.shape[0] == colors.shape[0]
depth_buffer, triangle_buffer, barycentric_weight = rasterize_triangles(vertices, triangles, h, w)
triangle_buffer_flat = np.reshape(triangle_buffer, [-1]) # [h*w]
barycentric_weight_flat = np.reshape(barycentric_weight, [-1, 3]) #[h*w, 3]: 3 barycentric weights per pixel
weight = barycentric_weight_flat[:, :, np.newaxis] # [h*w, 3(ver in tri), 1]
colors_flat = colors[triangles[triangle_buffer_flat, :], :] # [h*w(tri id in pixel), 3(ver in tri), c(color in ver)]
colors_flat = weight*colors_flat # [h*w, 3, 3]
colors_flat = np.sum(colors_flat, 1) #[h*w, 3]. add tri.
image = np.reshape(colors_flat, [h, w, c])
# mask = (triangle_buffer[:,:] > -1).astype(np.float32)
# image = image*mask[:,:,np.newaxis]
return image
def render_colors(vertices, triangles, colors, h, w, c = 3):
''' render mesh with colors
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
colors: [nver, 3]
h: height
w: width
Returns:
image: [h, w, c].
'''
assert vertices.shape[0] == colors.shape[0]
# initial
image = np.zeros((h, w, c))
depth_buffer = np.zeros([h, w]) - 999999.
for i in range(triangles.shape[0]):
tri = triangles[i, :] # 3 vertex indices
# the inner bounding box
umin = max(int(np.ceil(np.min(vertices[tri, 0]))), 0)
umax = min(int(np.floor(np.max(vertices[tri, 0]))), w-1)
vmin = max(int(np.ceil(np.min(vertices[tri, 1]))), 0)
vmax = min(int(np.floor(np.max(vertices[tri, 1]))), h-1)
if umax<umin or vmax<vmin:
continue
for u in range(umin, umax+1):
for v in range(vmin, vmax+1):
if not isPointInTri([u,v], vertices[tri, :2]):
continue
w0, w1, w2 = get_point_weight([u, v], vertices[tri, :2])
point_depth = w0*vertices[tri[0], 2] + w1*vertices[tri[1], 2] + w2*vertices[tri[2], 2]
if point_depth > depth_buffer[v, u]:
depth_buffer[v, u] = point_depth
image[v, u, :] = w0*colors[tri[0], :] + w1*colors[tri[1], :] + w2*colors[tri[2], :]
return image
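# --- Illustrative sketch (toy data): rasterize one triangle with per-vertex
# colors into a small image and check the output shape.
def _demo_render_triangle():
    vertices = np.array([[10., 10., 0.], [50., 12., 0.], [30., 55., 0.]])
    triangles = np.array([[0, 1, 2]], dtype = np.int32)
    colors = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
    image = render_colors(vertices, triangles, colors, h = 64, w = 64)
    print(image.shape, image.max())  # (64, 64, 3), ~1.0 near a vertex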
def render_texture(vertices, triangles, texture, tex_coords, tex_triangles, h, w, c = 3, mapping_type = 'nearest'):
''' render mesh with texture map
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
texture: [tex_h, tex_w, 3]
tex_coords: [ntexcoords, 3]
tex_triangles: [ntri, 3]
h: height of rendering
w: width of rendering
c: channel
mapping_type: 'bilinear' or 'nearest'
'''
assert triangles.shape[0] == tex_triangles.shape[0]
tex_h, tex_w, _ = texture.shape
# initial
image = np.zeros((h, w, c))
depth_buffer = np.zeros([h, w]) - 999999.
for i in range(triangles.shape[0]):
tri = triangles[i, :] # 3 vertex indices
tex_tri = tex_triangles[i, :] # 3 tex indices
# the inner bounding box
umin = max(int(np.ceil(np.min(vertices[tri, 0]))), 0)
umax = min(int(np.floor(np.max(vertices[tri, 0]))), w-1)
vmin = max(int(np.ceil(np.min(vertices[tri, 1]))), 0)
vmax = min(int(np.floor(np.max(vertices[tri, 1]))), h-1)
if umax<umin or vmax<vmin:
continue
for u in range(umin, umax+1):
for v in range(vmin, vmax+1):
if not isPointInTri([u,v], vertices[tri, :2]):
continue
w0, w1, w2 = get_point_weight([u, v], vertices[tri, :2])
point_depth = w0*vertices[tri[0], 2] + w1*vertices[tri[1], 2] + w2*vertices[tri[2], 2]
if point_depth > depth_buffer[v, u]:
# update depth
depth_buffer[v, u] = point_depth
# tex coord
tex_xy = w0*tex_coords[tex_tri[0], :] + w1*tex_coords[tex_tri[1], :] + w2*tex_coords[tex_tri[2], :]
tex_xy[0] = max(min(tex_xy[0], float(tex_w - 1)), 0.0);
tex_xy[1] = max(min(tex_xy[1], float(tex_h - 1)), 0.0);
# nearest
if mapping_type == 'nearest':
tex_xy = np.round(tex_xy).astype(np.int32)
tex_value = texture[tex_xy[1], tex_xy[0], :]
# bilinear
elif mapping_type == 'bilinear':
# next 4 pixels
ul = texture[int(np.floor(tex_xy[1])), int(np.floor(tex_xy[0])), :]
ur = texture[int(np.floor(tex_xy[1])), int(np.ceil(tex_xy[0])), :]
dl = texture[int(np.ceil(tex_xy[1])), int(np.floor(tex_xy[0])), :]
dr = texture[int(np.ceil(tex_xy[1])), int(np.ceil(tex_xy[0])), :]
yd = tex_xy[1] - np.floor(tex_xy[1])
xd = tex_xy[0] - np.floor(tex_xy[0])
tex_value = ul*(1-xd)*(1-yd) + ur*xd*(1-yd) + dl*(1-xd)*yd + dr*xd*yd
image[v, u, :] = tex_value
return image

View File

@ -0,0 +1,385 @@
'''
Functions about transforming mesh(changing the position: modify vertices).
1. forward: transform(transform, camera, project).
2. backward: estimate transform matrix from correspondences.
Preparation knowledge:
transform&camera model:
https://cs184.eecs.berkeley.edu/lecture/transforms-2
Part I: camera geometry and single view geometry in MVGCV
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import math
from math import cos, sin
def angle2matrix(angles):
''' get rotation matrix from three rotation angles(degree). right-handed.
Args:
angles: [3,]. x, y, z angles
x: pitch. positive for looking down.
y: yaw. positive for looking left.
z: roll. positive for tilting head right.
Returns:
R: [3, 3]. rotation matrix.
'''
x, y, z = np.deg2rad(angles[0]), np.deg2rad(angles[1]), np.deg2rad(angles[2])
# x
Rx=np.array([[1, 0, 0],
[0, cos(x), -sin(x)],
[0, sin(x), cos(x)]])
# y
Ry=np.array([[ cos(y), 0, sin(y)],
[ 0, 1, 0],
[-sin(y), 0, cos(y)]])
# z
Rz=np.array([[cos(z), -sin(z), 0],
[sin(z), cos(z), 0],
[ 0, 0, 1]])
R=Rz.dot(Ry.dot(Rx))
return R.astype(np.float32)
def angle2matrix_3ddfa(angles):
''' get rotation matrix from three rotation angles(radian). The same as in 3DDFA.
Args:
angles: [3,]. x, y, z angles
x: pitch.
y: yaw.
z: roll.
Returns:
R: 3x3. rotation matrix.
'''
# x, y, z = np.deg2rad(angles[0]), np.deg2rad(angles[1]), np.deg2rad(angles[2])
x, y, z = angles[0], angles[1], angles[2]
# x
Rx=np.array([[1, 0, 0],
[0, cos(x), sin(x)],
[0, -sin(x), cos(x)]])
# y
Ry=np.array([[ cos(y), 0, -sin(y)],
[ 0, 1, 0],
[sin(y), 0, cos(y)]])
# z
Rz=np.array([[cos(z), sin(z), 0],
[-sin(z), cos(z), 0],
[ 0, 0, 1]])
R = Rx.dot(Ry).dot(Rz)
return R.astype(np.float32)
## ------------------------------------------ 1. transform(transform, project, camera).
## ---------- 3d-3d transform. Transform obj in world space
def rotate(vertices, angles):
''' rotate vertices.
X_new = R.dot(X). X: 3 x 1
Args:
vertices: [nver, 3].
rx, ry, rz: degree angles
rx: pitch. positive for looking down
ry: yaw. positive for looking left
rz: roll. positive for tilting head right
Returns:
rotated vertices: [nver, 3]
'''
R = angle2matrix(angles)
rotated_vertices = vertices.dot(R.T)
return rotated_vertices
def similarity_transform(vertices, s, R, t3d):
''' similarity transform. dof = 7.
3D: s*R.dot(X) + t
Homo: M = [[sR, t],[0^T, 1]]. M.dot(X)
Args:(float32)
vertices: [nver, 3].
s: [1,]. scale factor.
R: [3,3]. rotation matrix.
t3d: [3,]. 3d translation vector.
Returns:
transformed vertices: [nver, 3]
'''
t3d = np.squeeze(np.array(t3d, dtype = np.float32))
transformed_vertices = s * vertices.dot(R.T) + t3d[np.newaxis, :]
return transformed_vertices
## -------------- Camera. from world space to camera space
# Ref: https://cs184.eecs.berkeley.edu/lecture/transforms-2
def normalize(x):
epsilon = 1e-12
norm = np.sqrt(np.sum(x**2, axis = 0))
norm = np.maximum(norm, epsilon)
return x/norm
def lookat_camera(vertices, eye, at = None, up = None):
""" 'look at' transformation: from world space to camera space
standard camera space:
camera located at the origin.
looking down negative z-axis.
vertical vector is y-axis.
Xcam = R(X - C)
Homo: [[R, -RC], [0, 1]]
Args:
vertices: [nver, 3]
eye: [3,] the XYZ world space position of the camera.
at: [3,] a position along the center of the camera's gaze.
up: [3,] up direction
Returns:
transformed_vertices: [nver, 3]
"""
if at is None:
at = np.array([0, 0, 0], np.float32)
if up is None:
up = np.array([0, 1, 0], np.float32)
eye = np.array(eye).astype(np.float32)
at = np.array(at).astype(np.float32)
z_axis = -normalize(at - eye) # look forward
x_axis = normalize(np.cross(up, z_axis)) # look right
y_axis = np.cross(z_axis, x_axis) # look up
R = np.stack((x_axis, y_axis, z_axis)) # 3 x 3
transformed_vertices = vertices - eye # translation
transformed_vertices = transformed_vertices.dot(R.T) # rotation
return transformed_vertices
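# --- Illustrative sketch (toy data; not part of the original file): move a
# mesh into camera space with a camera at +z looking at the origin, then
# apply a similarity transform in world space.
def _demo_camera():
    vertices = np.random.rand(8, 3).astype(np.float32) - 0.5
    cam = lookat_camera(vertices, eye = [0, 0, 5])  # default: at origin, y up
    print(cam[:, 2].mean())  # camera looks down -z, so the mesh sits near z = -5
    moved = similarity_transform(vertices, 2.0, angle2matrix([0, 30, 0]), [0, 0, 1])
    print(moved.shape)  # [8, 3]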
## --------- 3d-2d project. from camera space to image plane
# generally, image plane only keeps x,y channels, here reserve z channel for calculating z-buffer.
def orthographic_project(vertices):
''' scaled orthographic projection (just drop z)
assumes: variations in depth over the object are small relative to the mean distance from camera to object
x -> x*f/z, y -> y*f/z, z -> f.
for points i, j: zi ~= zj, so z can simply be dropped
** often used for faces
Homo: P = [[1,0,0,0], [0,1,0,0], [0,0,1,0]]
Args:
vertices: [nver, 3]
Returns:
projected_vertices: [nver, 3]. z is kept so it can still be used for z-buffering.
'''
return vertices.copy()
def perspective_project(vertices, fovy, aspect_ratio = 1., near = 0.1, far = 1000.):
''' perspective projection.
Args:
vertices: [nver, 3]
fovy: vertical angular field of view. degree.
aspect_ratio : width / height of field of view
near : depth of near clipping plane
far : depth of far clipping plane
Returns:
projected_vertices: [nver, 3]
'''
fovy = np.deg2rad(fovy)
top = near*np.tan(fovy/2.) # fovy spans the full vertical angle, so use the half-angle
bottom = -top
right = top*aspect_ratio
left = -right
#-- homo
P = np.array([[near/right, 0, 0, 0],
[0, near/top, 0, 0],
[0, 0, -(far+near)/(far-near), -2*far*near/(far-near)],
[0, 0, -1, 0]])
vertices_homo = np.hstack((vertices, np.ones((vertices.shape[0], 1)))) # [nver, 4]
projected_vertices = vertices_homo.dot(P.T)
projected_vertices = projected_vertices/projected_vertices[:,3:]
projected_vertices = projected_vertices[:,:3]
projected_vertices[:,2] = -projected_vertices[:,2]
#-- non homo. only fovy
# projected_vertices = vertices.copy()
# projected_vertices[:,0] = -(near/right)*vertices[:,0]/vertices[:,2]
# projected_vertices[:,1] = -(near/top)*vertices[:,1]/vertices[:,2]
return projected_vertices
def to_image(vertices, h, w, is_perspective = False):
''' change vertices to image coord system
3d system: XYZ, center(0, 0, 0)
2d image: x(u), y(v). center(w/2, h/2), flip y-axis.
Args:
vertices: [nver, 3]
h: height of the rendering
w : width of the rendering
Returns:
projected_vertices: [nver, 3]
'''
image_vertices = vertices.copy()
if is_perspective:
# if perspective, the projected vertices are normalized to [-1, 1]. so change it to image size first.
image_vertices[:,0] = image_vertices[:,0]*w/2
image_vertices[:,1] = image_vertices[:,1]*h/2
# move to center of image
image_vertices[:,0] = image_vertices[:,0] + w/2
image_vertices[:,1] = image_vertices[:,1] + h/2
# flip vertices along y-axis.
image_vertices[:,1] = h - image_vertices[:,1] - 1
return image_vertices
#### -------------------------------------------2. estimate transform matrix from correspondences.
def estimate_affine_matrix_3d23d(X, Y):
''' Using least-squares solution
Args:
X: [n, 3]. 3d points(fixed)
Y: [n, 3]. corresponding 3d points(moving). Y = PX
Returns:
P_Affine: (3, 4). Affine camera matrix fitted by least squares, so that Y = X_homo.dot(P_Affine.T).
'''
X_homo = np.hstack((X, np.ones([X.shape[0], 1]))) #n x 4
P = np.linalg.lstsq(X_homo, Y, rcond=None)[0].T # Affine matrix. 3 x 4
return P
def estimate_affine_matrix_3d22d(X, x):
''' Using Golden Standard Algorithm for estimating an affine camera
matrix P from world to image correspondences.
See Alg.7.2. in MVGCV
Code Ref: https://github.com/patrikhuber/eos/blob/master/include/eos/fitting/affine_camera_estimation.hpp
x_homo = X_homo.dot(P_Affine)
Args:
X: [n, 3]. corresponding 3d points(fixed)
x: [n, 2]. n>=4. 2d points(moving). x = PX
Returns:
P_Affine: [3, 4]. Affine camera matrix
'''
X = X.T; x = x.T
assert(x.shape[1] == X.shape[1])
n = x.shape[1]
assert(n >= 4)
#--- 1. normalization
# 2d points
mean = np.mean(x, 1) # (2,)
x = x - np.tile(mean[:, np.newaxis], [1, n])
average_norm = np.mean(np.sqrt(np.sum(x**2, 0)))
scale = np.sqrt(2) / average_norm
x = scale * x
T = np.zeros((3,3), dtype = np.float32)
T[0, 0] = T[1, 1] = scale
T[:2, 2] = -mean*scale
T[2, 2] = 1
# 3d points
mean = np.mean(X, 1) # (3,)
X = X - np.tile(mean[:, np.newaxis], [1, n])
average_norm = np.mean(np.sqrt(np.sum(X**2, 0)))
scale = np.sqrt(3) / average_norm
X = scale * X
U = np.zeros((4,4), dtype = np.float32)
U[0, 0] = U[1, 1] = U[2, 2] = scale
U[:3, 3] = -mean*scale
U[3, 3] = 1
# --- 2. equations
A = np.zeros((n*2, 8), dtype = np.float32);
X_homo = np.vstack((X, np.ones((1, n)))).T
A[:n, :4] = X_homo
A[n:, 4:] = X_homo
b = np.reshape(x, [-1, 1])
# --- 3. solution
p_8 = np.linalg.pinv(A).dot(b)
P = np.zeros((3, 4), dtype = np.float32)
P[0, :] = p_8[:4, 0]
P[1, :] = p_8[4:, 0]
P[-1, -1] = 1
# --- 4. denormalization
P_Affine = np.linalg.inv(T).dot(P.dot(U))
return P_Affine
def P2sRt(P):
''' decompositing camera matrix P
Args:
P: (3, 4). Affine Camera Matrix.
Returns:
s: scale factor.
R: (3, 3). rotation matrix.
t: (3,). translation.
'''
t = P[:, 3]
R1 = P[0:1, :3]
R2 = P[1:2, :3]
s = (np.linalg.norm(R1) + np.linalg.norm(R2))/2.0
r1 = R1/np.linalg.norm(R1)
r2 = R2/np.linalg.norm(R2)
r3 = np.cross(r1, r2)
R = np.concatenate((r1, r2, r3), 0)
return s, R, t
#Ref: https://www.learnopencv.com/rotation-matrix-to-euler-angles/
def isRotationMatrix(R):
''' checks whether a matrix is a valid rotation matrix (i.e. orthogonal: R^T R = I)
'''
Rt = np.transpose(R)
shouldBeIdentity = np.dot(Rt, R)
I = np.identity(3, dtype = R.dtype)
n = np.linalg.norm(I - shouldBeIdentity)
return n < 1e-6
def matrix2angle(R):
''' get three Euler angles from Rotation Matrix
Args:
R: (3,3). rotation matrix
Returns:
x: pitch
y: yaw
z: roll
'''
assert(isRotationMatrix(R))
sy = math.sqrt(R[0,0] * R[0,0] + R[1,0] * R[1,0])
singular = sy < 1e-6
if not singular :
x = math.atan2(R[2,1] , R[2,2])
y = math.atan2(-R[2,0], sy)
z = math.atan2(R[1,0], R[0,0])
else :
x = math.atan2(-R[1,2], R[1,1])
y = math.atan2(-R[2,0], sy)
z = 0
# rx, ry, rz = np.rad2deg(x), np.rad2deg(y), np.rad2deg(z)
rx, ry, rz = x*180/np.pi, y*180/np.pi, z*180/np.pi
return rx, ry, rz
# def matrix2angle(R):
# ''' compute three Euler angles from a Rotation Matrix. Ref: http://www.gregslabaugh.net/publications/euler.pdf
# Args:
# R: (3,3). rotation matrix
# Returns:
# x: yaw
# y: pitch
# z: roll
# '''
# # assert(isRotationMatrix(R))
# if R[2,0] != 1 and R[2,0] != -1:
# x = math.asin(R[2,0])
# y = math.atan2(R[2,1]/cos(x), R[2,2]/cos(x))
# z = math.atan2(R[1,0]/cos(x), R[0,0]/cos(x))
# else:# Gimbal lock
# z = 0 #can be anything
# if R[2,0] == -1:
# x = np.pi/2
# y = z + math.atan2(R[0,1], R[0,2])
# else:
# x = -np.pi/2
# y = -z + math.atan2(-R[0,1], -R[0,2])
# return x, y, z

View File

@ -0,0 +1,24 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
from skimage import measure
from mpl_toolkits.mplot3d import Axes3D
def plot_mesh(vertices, triangles, subplot = [1,1,1], title = 'mesh', el = 90, az = -90, lwdt=.1, dist = 6, color = "grey"):
'''
plot the mesh
Args:
vertices: [nver, 3]
triangles: [ntri, 3]
'''
ax = plt.subplot(subplot[0], subplot[1], subplot[2], projection = '3d')
ax.plot_trisurf(vertices[:, 0], vertices[:, 1], vertices[:, 2], triangles = triangles, lw = lwdt, color = color, alpha = 1)
ax.axis("off")
ax.view_init(elev = el, azim = az)
ax.dist = dist
plt.title(title)
### -------------- Todo: use vtk to visualize mesh? or visvis? or VisPy?

View File

@ -0,0 +1,7 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from .. import mesh
from .morphabel_model import MorphabelModel
from . import load

View File

@ -0,0 +1,272 @@
'''
Estimating parameters about vertices: shape para, exp para, pose para(s, R, t)
'''
import numpy as np
from .. import mesh
''' TODO: a clear document.
Given: image_points, 3D Model, Camera Matrix(s, R, t2d)
Estimate: shape parameters, expression parameters
Inference:
projected_vertices = s*P*R(mu + shape + exp) + t2d --> image_points
s*P*R*shape + s*P*R(mu + exp) + t2d --> image_points
# Define:
X = vertices
x_hat = projected_vertices
x = image_points
A = s*P*R
b = s*P*R(mu + exp) + t2d
==>
x_hat = A*shape + b (2 x n)
A*shape (2 x n)
shape = reshape(shapePC * sp) (3 x n)
shapePC*sp : (3n x 1)
* flatten:
x_hat_flatten = A*shape + b_flatten (2n x 1)
A*shape (2n x 1)
--> A*shapePC (2n x 199) sp: 199 x 1
# Define:
pc_2d = A* reshape(shapePC)
pc_2d_flatten = flatten(pc_2d) (2n x 199)
=====>
x_hat_flatten = pc_2d_flatten * sp + b_flatten ---> x_flatten (2n x 1)
Goals:
(ignore flatten, pc_2d-->pc)
min E = || x_hat - x ||^2 + lambda*sum(sp/sigma)^2
= || pc * sp + b - x ||^2 + lambda*sum(sp/sigma)^2
Solve:
d(E)/d(sp) = 0
2 * pc' * (pc * sp + b - x) + 2 * lambda * sp / (sigma' * sigma) = 0
Get:
(pc' * pc + lambda / (sigma'* sigma)) * sp = pc' * (x - b)
'''
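# --- A minimal numeric sketch of the closed form above (toy sizes; `pc`
# stands in for the projected principal components and `sigma` for the
# eigenvalues used as a prior; names are assumptions).
def _demo_regularized_solve():
    rng = np.random.RandomState(0)
    pc = rng.rand(20, 5)  # plays the role of the (2n x n_sp) matrix
    sigma = rng.rand(5, 1) + 0.5
    sp_true = rng.randn(5, 1)
    x_minus_b = pc.dot(sp_true)  # noiseless "observations"
    lamb = 10.
    equation_left = pc.T.dot(pc) + lamb*np.diagflat(1/sigma**2)
    equation_right = pc.T.dot(x_minus_b)
    sp = np.linalg.solve(equation_left, equation_right)
    print(np.abs(sp - sp_true).max())  # nonzero: the prior shrinks sp toward 0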
def estimate_shape(x, shapeMU, shapePC, shapeEV, expression, s, R, t2d, lamb = 3000):
'''
Args:
x: (2, n). image points (to be fitted)
shapeMU: (3n, 1)
shapePC: (3n, n_sp)
shapeEV: (n_sp, 1)
expression: (3, n)
s: scale
R: (3, 3). rotation matrix
t2d: (2,). 2d translation
lamb: regularization coefficient
Returns:
shape_para: (n_sp, 1) shape parameters(coefficients)
'''
x = x.copy()
assert(shapeMU.shape[0] == shapePC.shape[0])
assert(shapeMU.shape[0] == x.shape[1]*3)
dof = shapePC.shape[1]
n = x.shape[1]
sigma = shapeEV
t2d = np.array(t2d)
P = np.array([[1, 0, 0], [0, 1, 0]], dtype = np.float32)
A = s*P.dot(R)
# --- calc pc
pc_3d = np.resize(shapePC.T, [dof, n, 3]) # 199 x n x 3
pc_3d = np.reshape(pc_3d, [dof*n, 3])
pc_2d = pc_3d.dot(A.T) # [dof*n, 2]
pc = np.reshape(pc_2d, [dof, -1]).T # 2n x 199
# --- calc b
# shapeMU
mu_3d = np.resize(shapeMU, [n, 3]).T # 3 x n
# expression
exp_3d = expression
#
b = A.dot(mu_3d + exp_3d) + np.tile(t2d[:, np.newaxis], [1, n]) # 2 x n
b = np.reshape(b.T, [-1, 1]) # 2n x 1
# --- solve
equation_left = np.dot(pc.T, pc) + lamb * np.diagflat(1/sigma**2)
x = np.reshape(x.T, [-1, 1])
equation_right = np.dot(pc.T, x - b)
shape_para = np.dot(np.linalg.inv(equation_left), equation_right)
return shape_para
def estimate_expression(x, shapeMU, expPC, expEV, shape, s, R, t2d, lamb = 2000):
'''
Args:
x: (2, n). image points (to be fitted)
shapeMU: (3n, 1)
expPC: (3n, n_ep)
expEV: (n_ep, 1)
shape: (3, n)
s: scale
R: (3, 3). rotation matrix
t2d: (2,). 2d translation
lamb: regularization coefficient
Returns:
exp_para: (n_ep, 1) expression parameters(coefficients)
'''
x = x.copy()
assert(shapeMU.shape[0] == expPC.shape[0])
assert(shapeMU.shape[0] == x.shape[1]*3)
dof = expPC.shape[1]
n = x.shape[1]
sigma = expEV
t2d = np.array(t2d)
P = np.array([[1, 0, 0], [0, 1, 0]], dtype = np.float32)
A = s*P.dot(R)
# --- calc pc
pc_3d = np.resize(expPC.T, [dof, n, 3])
pc_3d = np.reshape(pc_3d, [dof*n, 3])
pc_2d = pc_3d.dot(A.T)
pc = np.reshape(pc_2d, [dof, -1]).T # 2n x 29
# --- calc b
# shapeMU
mu_3d = np.resize(shapeMU, [n, 3]).T # 3 x n
# expression
shape_3d = shape
#
b = A.dot(mu_3d + shape_3d) + np.tile(t2d[:, np.newaxis], [1, n]) # 2 x n
b = np.reshape(b.T, [-1, 1]) # 2n x 1
# --- solve
equation_left = np.dot(pc.T, pc) + lamb * np.diagflat(1/sigma**2)
x = np.reshape(x.T, [-1, 1])
equation_right = np.dot(pc.T, x - b)
exp_para = np.dot(np.linalg.inv(equation_left), equation_right)
return exp_para
# ---------------- fit
def fit_points(x, X_ind, model, n_sp, n_ep, max_iter = 4):
'''
Args:
x: (n, 2) image points
X_ind: (n,) corresponding Model vertex indices
model: 3DMM
max_iter: iteration
Returns:
sp: (n_sp, 1). shape parameters
ep: (n_ep, 1). exp parameters
s, R, t
'''
x = x.copy().T
#-- init
sp = np.zeros((n_sp, 1), dtype = np.float32)
ep = np.zeros((n_ep, 1), dtype = np.float32)
#-------------------- estimate
X_ind_all = np.tile(X_ind[np.newaxis, :], [3, 1])*3
X_ind_all[1, :] += 1
X_ind_all[2, :] += 2
valid_ind = X_ind_all.flatten('F')
shapeMU = model['shapeMU'][valid_ind, :]
shapePC = model['shapePC'][valid_ind, :n_sp]
expPC = model['expPC'][valid_ind, :n_ep]
for i in range(max_iter):
X = shapeMU + shapePC.dot(sp) + expPC.dot(ep)
X = np.reshape(X, [int(len(X)/3), 3]).T
#----- estimate pose
P = mesh.transform.estimate_affine_matrix_3d22d(X.T, x.T)
s, R, t = mesh.transform.P2sRt(P)
rx, ry, rz = mesh.transform.matrix2angle(R)
#print('Iter:{}; estimated pose: s {}, rx {}, ry {}, rz {}, t1 {}, t2 {}'.format(i, s, rx, ry, rz, t[0], t[1]))
#----- estimate shape
# expression
shape = shapePC.dot(sp)
shape = np.reshape(shape, [int(len(shape)/3), 3]).T
ep = estimate_expression(x, shapeMU, expPC, model['expEV'][:n_ep,:], shape, s, R, t[:2], lamb = 20)
# shape
expression = expPC.dot(ep)
expression = np.reshape(expression, [int(len(expression)/3), 3]).T
if i == 0 :
sp = estimate_shape(x, shapeMU, shapePC, model['shapeEV'][:n_sp,:], expression, s, R, t[:2], lamb = 40)
return sp, ep, s, R, t
# ---------------- fitting process
def fit_points_for_show(x, X_ind, model, n_sp, n_ep, max_iter = 4):
'''
Args:
x: (n, 2) image points
X_ind: (n,) corresponding Model vertex indices
model: 3DMM
max_iter: iteration
Returns:
sp: (n_sp, 1). shape parameters
ep: (n_ep, 1). exp parameters
s, R, t
'''
x = x.copy().T
#-- init
sp = np.zeros((n_sp, 1), dtype = np.float32)
ep = np.zeros((n_ep, 1), dtype = np.float32)
#-------------------- estimate
X_ind_all = np.tile(X_ind[np.newaxis, :], [3, 1])*3
X_ind_all[1, :] += 1
X_ind_all[2, :] += 2
valid_ind = X_ind_all.flatten('F')
shapeMU = model['shapeMU'][valid_ind, :]
shapePC = model['shapePC'][valid_ind, :n_sp]
expPC = model['expPC'][valid_ind, :n_ep]
s = 4e-04
R = mesh.transform.angle2matrix([0, 0, 0])
t = [0, 0, 0]
lsp = []; lep = []; ls = []; lR = []; lt = []
for i in range(max_iter):
X = shapeMU + shapePC.dot(sp) + expPC.dot(ep)
X = np.reshape(X, [int(len(X)/3), 3]).T
lsp.append(sp); lep.append(ep); ls.append(s); lR.append(R); lt.append(t)
#----- estimate pose
P = mesh.transform.estimate_affine_matrix_3d22d(X.T, x.T)
s, R, t = mesh.transform.P2sRt(P)
lsp.append(sp); lep.append(ep); ls.append(s); lR.append(R); lt.append(t)
#----- estimate shape
# expression
shape = shapePC.dot(sp)
shape = np.reshape(shape, [int(len(shape)/3), 3]).T
ep = estimate_expression(x, shapeMU, expPC, model['expEV'][:n_ep,:], shape, s, R, t[:2], lamb = 20)
lsp.append(sp); lep.append(ep); ls.append(s); lR.append(R); lt.append(t)
# shape
expression = expPC.dot(ep)
expression = np.reshape(expression, [int(len(expression)/3), 3]).T
sp = estimate_shape(x, shapeMU, shapePC, model['shapeEV'][:n_sp,:], expression, s, R, t[:2], lamb = 40)
# print('ls', ls)
# print('lR', lR)
return np.array(lsp), np.array(lep), np.array(ls), np.array(lR), np.array(lt)

View File

@ -0,0 +1,110 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import scipy.io as sio
### --------------------------------- load BFM data
def load_BFM(model_path):
''' load BFM 3DMM model
Args:
model_path: path to BFM model.
Returns:
model: (nver = 53215, ntri = 105840). nver: number of vertices. ntri: number of triangles.
'shapeMU': [3*nver, 1]
'shapePC': [3*nver, 199]
'shapeEV': [199, 1]
'expMU': [3*nver, 1]
'expPC': [3*nver, 29]
'expEV': [29, 1]
'texMU': [3*nver, 1]
'texPC': [3*nver, 199]
'texEV': [199, 1]
'tri': [ntri, 3] (start from 1, should sub 1 in python and c++)
'tri_mouth': [114, 3] (start from 1, as a supplement to mouth triangles)
'kpt_ind': [68,] (start from 1)
PS:
You can change codes according to your own saved data.
Just make sure the model has corresponding attributes.
'''
C = sio.loadmat(model_path)
model = C['model']
model = model[0,0]
# change dtype from double(np.float64) to np.float32,
# since processing big matrices (especially matrix dot products) is slow in python.
model['shapeMU'] = (model['shapeMU'] + model['expMU']).astype(np.float32)
model['shapePC'] = model['shapePC'].astype(np.float32)
model['shapeEV'] = model['shapeEV'].astype(np.float32)
model['expEV'] = model['expEV'].astype(np.float32)
model['expPC'] = model['expPC'].astype(np.float32)
# MATLAB indices start at 1; change to 0-based for python.
model['tri'] = model['tri'].T.copy(order = 'C').astype(np.int32) - 1
model['tri_mouth'] = model['tri_mouth'].T.copy(order = 'C').astype(np.int32) - 1
# kpt ind
model['kpt_ind'] = (np.squeeze(model['kpt_ind']) - 1).astype(np.int32)
return model
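# --- Usage sketch (assumption: 'BFM.mat' is a locally prepared Basel Face
# Model file; see the face3d data preparation notes).
def _demo_load_bfm(model_path = 'BFM.mat'):
    model = load_BFM(model_path)
    print(model['shapePC'].shape)  # (3*53215, 199) for the standard BFM
    print(model['tri'].dtype, model['tri'].min())  # int32, 0 (already 0-based)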
def load_BFM_info(path = 'BFM_info.mat'):
''' load 3DMM model extra information
Args:
path: path to BFM info.
Returns:
model_info:
'symlist': 2 x 26720
'symlist_tri': 2 x 52937
'segbin': 4 x n (0: nose, 1: eye, 2: mouth, 3: cheek)
'segbin_tri': 4 x ntri
'face_contour': 1 x 28
'face_contour_line': 1 x 512
'face_contour_front': 1 x 28
'face_contour_front_line': 1 x 512
'nose_hole': 1 x 142
'nose_hole_right': 1 x 71
'nose_hole_left': 1 x 71
'parallel': 17 x 1 cell
'parallel_face_contour': 28 x 1 cell
'uv_coords': n x 2
'''
C = sio.loadmat(path)
model_info = C['model_info']
model_info = model_info[0,0]
return model_info
def load_uv_coords(path = 'BFM_UV.mat'):
''' load uv coords of BFM
Args:
path: path to data.
Returns:
uv_coords: [nver, 2]. range: 0-1
'''
C = sio.loadmat(path)
uv_coords = C['UV'].copy(order = 'C')
return uv_coords
def load_pncc_code(path = 'pncc_code.mat'):
''' load pncc code of BFM
PNCC code: Defined in 'Face Alignment Across Large Poses: A 3D Solution Xiangyu'
download at http://www.cbsr.ia.ac.cn/users/xiangyuzhu/projects/3DDFA/main.htm.
Args:
path: path to data.
Returns:
pncc_code: [nver, 3]
'''
C = sio.loadmat(path)
pncc_code = C['vertex_code'].T
return pncc_code
##
def get_organ_ind(model_info):
''' get nose, eye, mouth index
'''
valid_bin = model_info['segbin'].astype(bool)
organ_ind = np.nonzero(valid_bin[0,:])[0]
for i in range(1, valid_bin.shape[0] - 1):
organ_ind = np.union1d(organ_ind, np.nonzero(valid_bin[i,:])[0])
return organ_ind.astype(np.int32)

View File

@ -0,0 +1,143 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import scipy.io as sio
from .. import mesh
from . import fit
from . import load
class MorphabelModel(object):
"""docstring for MorphabelModel
model: nver: number of vertices. ntri: number of triangles. *: must have. ~: can generate ones array for place holder.
'shapeMU': [3*nver, 1]. *
'shapePC': [3*nver, n_shape_para]. *
'shapeEV': [n_shape_para, 1]. ~
'expMU': [3*nver, 1]. ~
'expPC': [3*nver, n_exp_para]. ~
'expEV': [n_exp_para, 1]. ~
'texMU': [3*nver, 1]. ~
'texPC': [3*nver, n_tex_para]. ~
'texEV': [n_tex_para, 1]. ~
'tri': [ntri, 3] (start from 1, should sub 1 in python and c++). *
'tri_mouth': [114, 3] (start from 1, as a supplement to mouth triangles). ~
'kpt_ind': [68,] (start from 1). ~
"""
def __init__(self, model_path, model_type = 'BFM'):
super( MorphabelModel, self).__init__()
if model_type=='BFM':
self.model = load.load_BFM(model_path)
else:
print('sorry, only the BFM model is supported for now')
exit()
# fixed attributes
self.nver = self.model['shapePC'].shape[0]//3
self.ntri = self.model['tri'].shape[0]
self.n_shape_para = self.model['shapePC'].shape[1]
self.n_exp_para = self.model['expPC'].shape[1]
self.n_tex_para = self.model['texMU'].shape[1]
self.kpt_ind = self.model['kpt_ind']
self.triangles = self.model['tri']
self.full_triangles = np.vstack((self.model['tri'], self.model['tri_mouth']))
# ------------------------------------- shape: represented with mesh(vertices & triangles(fixed))
def get_shape_para(self, type = 'random'):
if type == 'zero':
sp = np.zeros((self.n_shape_para, 1))
elif type == 'random':
sp = np.random.rand(self.n_shape_para, 1)*1e04
return sp
def get_exp_para(self, type = 'random'):
if type == 'zero':
ep = np.zeros((self.n_exp_para, 1))
elif type == 'random':
ep = -1.5 + 3*np.random.random([self.n_exp_para, 1])
ep[6:, 0] = 0
return ep
def generate_vertices(self, shape_para, exp_para):
'''
Args:
shape_para: (n_shape_para, 1)
exp_para: (n_exp_para, 1)
Returns:
vertices: (nver, 3)
'''
vertices = self.model['shapeMU'] + self.model['shapePC'].dot(shape_para) + self.model['expPC'].dot(exp_para)
vertices = np.reshape(vertices, [int(3), int(len(vertices)/3)], 'F').T
return vertices
# -------------------------------------- texture: here represented with rgb value(colors) in vertices.
def get_tex_para(self, type = 'random'):
if type == 'zero':
tp = np.zeros((self.n_tex_para, 1))
elif type == 'random':
tp = np.random.rand(self.n_tex_para, 1)
return tp
def generate_colors(self, tex_para):
'''
Args:
tex_para: (n_tex_para, 1)
Returns:
colors: (nver, 3)
'''
colors = self.model['texMU'] + self.model['texPC'].dot(tex_para*self.model['texEV'])
colors = np.reshape(colors, [int(3), int(len(colors)/3)], 'F').T/255.
return colors
# ------------------------------------------- transformation
# ------------- transform
def rotate(self, vertices, angles):
''' rotate face
Args:
vertices: [nver, 3]
angles: [3] x, y, z rotation angle(degree)
x: pitch. positive for looking down
y: yaw. positive for looking left
z: roll. positive for tilting head right
Returns:
vertices: rotated vertices
'''
return mesh.transform.rotate(vertices, angles)
def transform(self, vertices, s, angles, t3d):
R = mesh.transform.angle2matrix(angles)
return mesh.transform.similarity_transform(vertices, s, R, t3d)
def transform_3ddfa(self, vertices, s, angles, t3d): # only used for processing 300W_LP data
R = mesh.transform.angle2matrix_3ddfa(angles)
return mesh.transform.similarity_transform(vertices, s, R, t3d)
# --------------------------------------------------- fitting
def fit(self, x, X_ind, max_iter = 4, isShow = False):
''' fit 3dmm & pose parameters
Args:
x: (n, 2) image points
X_ind: (n,) corresponding Model vertex indices
max_iter: iteration
isShow: whether to reserve middle results for show
Returns:
fitted_sp: (n_sp, 1). shape parameters
fitted_ep: (n_ep, 1). exp parameters
s, angles, t
'''
if isShow:
fitted_sp, fitted_ep, s, R, t = fit.fit_points_for_show(x, X_ind, self.model, n_sp = self.n_shape_para, n_ep = self.n_exp_para, max_iter = max_iter)
angles = np.zeros((R.shape[0], 3))
for i in range(R.shape[0]):
angles[i] = mesh.transform.matrix2angle(R[i])
else:
fitted_sp, fitted_ep, s, R, t = fit.fit_points(x, X_ind, self.model, n_sp = self.n_shape_para, n_ep = self.n_exp_para, max_iter = max_iter)
angles = mesh.transform.matrix2angle(R)
return fitted_sp, fitted_ep, s, angles, t
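# --- End-to-end sketch (assumptions: 'BFM.mat' exists locally and carries the
# 68 keypoint indices): generate a face, take its keypoints as fake image
# points, then fit shape/expression/pose back from them.
def _demo_fit(model_path = 'BFM.mat'):
    bfm = MorphabelModel(model_path)
    sp, ep = bfm.get_shape_para('random'), bfm.get_exp_para('random')
    vertices = bfm.generate_vertices(sp, ep)
    x = vertices[bfm.kpt_ind, :2]  # (68, 2) toy "image" points
    fitted_sp, fitted_ep, s, angles, t = bfm.fit(x, bfm.kpt_ind, max_iter = 3)
    print(s, angles)  # recovered scale and euler angles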

View File

@ -0,0 +1,18 @@
from __future__ import absolute_import
#from . import bbox
#from . import viz
#from . import random
#from . import metrics
#from . import parallel
from .storage import download, ensure_available, download_onnx
from .filesystem import get_model_dir
from .filesystem import makedirs, try_import_dali
from .constant import *
#from .bbox import bbox_iou
#from .block import recursive_visit, set_lr_mult, freeze_bn
#from .lr_scheduler import LRSequential, LRScheduler
#from .plot_history import TrainingHistory
#from .export_helper import export_block
#from .sync_loader_helper import split_data, split_and_load

View File

@ -0,0 +1,3 @@
DEFAULT_MP_NAME = 'buffalo_l'

View File

@ -0,0 +1,95 @@
"""
This code file mainly comes from https://github.com/dmlc/gluon-cv/blob/master/gluoncv/utils/download.py
"""
import os
import hashlib
import requests
from tqdm import tqdm
def check_sha1(filename, sha1_hash):
"""Check whether the sha1 hash of the file content matches the expected hash.
Parameters
----------
filename : str
Path to the file.
sha1_hash : str
Expected sha1 hash in hexadecimal digits.
Returns
-------
bool
Whether the file content matches the expected hash.
"""
sha1 = hashlib.sha1()
with open(filename, 'rb') as f:
while True:
data = f.read(1048576)
if not data:
break
sha1.update(data)
sha1_file = sha1.hexdigest()
l = min(len(sha1_file), len(sha1_hash))
return sha1_file[0:l] == sha1_hash[0:l]
def download_file(url, path=None, overwrite=False, sha1_hash=None):
"""Download an given URL
Parameters
----------
url : str
URL to download
path : str, optional
Destination path to store downloaded file. By default stores to the
current directory with same name as in url.
overwrite : bool, optional
Whether to overwrite destination file if already exists.
sha1_hash : str, optional
Expected sha1 hash in hexadecimal digits. Will ignore existing file when hash is specified
but doesn't match.
Returns
-------
str
The file path of the downloaded file.
"""
if path is None:
fname = url.split('/')[-1]
else:
path = os.path.expanduser(path)
if os.path.isdir(path):
fname = os.path.join(path, url.split('/')[-1])
else:
fname = path
if overwrite or not os.path.exists(fname) or (
sha1_hash and not check_sha1(fname, sha1_hash)):
dirname = os.path.dirname(os.path.abspath(os.path.expanduser(fname)))
if not os.path.exists(dirname):
os.makedirs(dirname)
print('Downloading %s from %s...' % (fname, url))
r = requests.get(url, stream=True)
if r.status_code != 200:
raise RuntimeError("Failed downloading url %s" % url)
total_length = r.headers.get('content-length')
with open(fname, 'wb') as f:
if total_length is None: # no content length header
for chunk in r.iter_content(chunk_size=1024):
if chunk: # filter out keep-alive new chunks
f.write(chunk)
else:
total_length = int(total_length)
for chunk in tqdm(r.iter_content(chunk_size=1024),
total=int(total_length / 1024. + 0.5),
unit='KB',
unit_scale=False,
dynamic_ncols=True):
f.write(chunk)
if sha1_hash and not check_sha1(fname, sha1_hash):
raise UserWarning('File {} is downloaded but the content hash does not match. ' \
'The repo may be outdated or download may be incomplete. ' \
'If the "repo_url" is overridden, consider switching to ' \
'the default repo.'.format(fname))
return fname

View File

@ -0,0 +1,103 @@
import cv2
import numpy as np
from skimage import transform as trans
arcface_dst = np.array(
[[38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366],
[41.5493, 92.3655], [70.7299, 92.2041]],
dtype=np.float32)
def estimate_norm(lmk, image_size=112, mode='arcface'):
# assert lmk.shape == (5, 2)
# assert image_size%112==0 or image_size%128==0
if image_size%112==0:
ratio = float(image_size)/112.0
diff_x = 0
else:
ratio = float(image_size)/128.0
diff_x = 8.0*ratio
dst = arcface_dst * ratio
dst[:,0] += diff_x
tform = trans.SimilarityTransform()
tform.estimate(lmk, dst)
M = tform.params[0:2, :]
return M
def norm_crop(img, landmark, image_size=112, mode='arcface'):
M = estimate_norm(landmark, image_size, mode)
warped = cv2.warpAffine(img, M, (image_size, image_size), borderValue=0.0)
return warped
def norm_crop2(img, landmark, image_size=112, mode='arcface'):
M = estimate_norm(landmark, image_size, mode)
warped = cv2.warpAffine(img, M, (image_size, image_size), borderValue=0.0)
return warped, M
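# --- Usage sketch (illustrative; `img` and `lmk` are assumed to come from a
# face detector that returns the five landmarks in arcface_dst order):
#   aligned = norm_crop(img, lmk, image_size=112)  # 112x112 aligned face crop
#   aligned, M = norm_crop2(img, lmk)              # same crop plus the affine M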
def square_crop(im, S):
if im.shape[0] > im.shape[1]:
height = S
width = int(float(im.shape[1]) / im.shape[0] * S)
scale = float(S) / im.shape[0]
else:
width = S
height = int(float(im.shape[0]) / im.shape[1] * S)
scale = float(S) / im.shape[1]
resized_im = cv2.resize(im, (width, height))
det_im = np.zeros((S, S, 3), dtype=np.uint8)
det_im[:resized_im.shape[0], :resized_im.shape[1], :] = resized_im
return det_im, scale
def transform(data, center, output_size, scale, rotation):
scale_ratio = scale
rot = float(rotation) * np.pi / 180.0
#translation = (output_size/2-center[0]*scale_ratio, output_size/2-center[1]*scale_ratio)
t1 = trans.SimilarityTransform(scale=scale_ratio)
cx = center[0] * scale_ratio
cy = center[1] * scale_ratio
t2 = trans.SimilarityTransform(translation=(-1 * cx, -1 * cy))
t3 = trans.SimilarityTransform(rotation=rot)
t4 = trans.SimilarityTransform(translation=(output_size / 2,
output_size / 2))
t = t1 + t2 + t3 + t4
M = t.params[0:2]
cropped = cv2.warpAffine(data,
M, (output_size, output_size),
borderValue=0.0)
return cropped, M
def trans_points2d(pts, M):
new_pts = np.zeros(shape=pts.shape, dtype=np.float32)
for i in range(pts.shape[0]):
pt = pts[i]
new_pt = np.array([pt[0], pt[1], 1.], dtype=np.float32)
new_pt = np.dot(M, new_pt)
#print('new_pt', new_pt.shape, new_pt)
new_pts[i] = new_pt[0:2]
return new_pts
def trans_points3d(pts, M):
scale = np.sqrt(M[0][0] * M[0][0] + M[0][1] * M[0][1])
#print(scale)
new_pts = np.zeros(shape=pts.shape, dtype=np.float32)
for i in range(pts.shape[0]):
pt = pts[i]
new_pt = np.array([pt[0], pt[1], 1.], dtype=np.float32)
new_pt = np.dot(M, new_pt)
#print('new_pt', new_pt.shape, new_pt)
new_pts[i][0:2] = new_pt[0:2]
new_pts[i][2] = pts[i][2] * scale
return new_pts
def trans_points(pts, M):
if pts.shape[1] == 2:
return trans_points2d(pts, M)
else:
return trans_points3d(pts, M)

View File

@ -0,0 +1,157 @@
"""
This code file mainly comes from https://github.com/dmlc/gluon-cv/blob/master/gluoncv/utils/filesystem.py
"""
import os
import os.path as osp
import errno
def get_model_dir(name, root='~/.insightface'):
root = os.path.expanduser(root)
model_dir = osp.join(root, 'models', name)
return model_dir
def makedirs(path):
"""Create directory recursively if not exists.
Similar to `mkdir -p`, you can skip checking existence before this function.
Parameters
----------
path : str
Path of the desired dir
"""
try:
os.makedirs(path)
except OSError as exc:
if exc.errno != errno.EEXIST:
raise
def try_import(package, message=None):
"""Try import specified package, with custom message support.
Parameters
----------
package : str
The name of the targeting package.
message : str, default is None
If not None, this function will raise customized error message when import error is found.
Returns
-------
module if found, raise ImportError otherwise
"""
try:
return __import__(package)
except ImportError as e:
if not message:
raise e
raise ImportError(message)
def try_import_cv2():
"""Try import cv2 at runtime.
Returns
-------
cv2 module if found. Raise ImportError otherwise
"""
msg = "cv2 is required, you can install by package manager, e.g. 'apt-get', \
or `pip install opencv-python --user` (note that this is unofficial PYPI package)."
return try_import('cv2', msg)
def try_import_mmcv():
"""Try import mmcv at runtime.
Returns
-------
mmcv module if found. Raise ImportError otherwise
"""
msg = "mmcv is required, you can install by first `pip install Cython --user` \
and then `pip install mmcv --user` (note that this is unofficial PYPI package)."
return try_import('mmcv', msg)
def try_import_rarfile():
"""Try import rarfile at runtime.
Returns
-------
rarfile module if found. Raise ImportError otherwise
"""
msg = "rarfile is required, you can install by first `sudo apt-get install unrar` \
and then `pip install rarfile --user` (note that this is unofficial PYPI package)."
return try_import('rarfile', msg)
def import_try_install(package, extern_url=None):
"""Try import the specified package.
If the package is not installed, try to install it with pip, then import it again.
Parameters
----------
package : str
The name of the package trying to import.
extern_url : str or None, optional
The external url if package is not hosted on PyPI.
For example, you can install a package using:
"pip install git+http://github.com/user/repo/tarball/master#egg=xxx".
In this case, pass that url as extern_url.
Returns
-------
<class 'Module'>
The imported python module.
"""
try:
return __import__(package)
except ImportError:
try:
from pip import main as pipmain
except ImportError:
from pip._internal import main as pipmain
# trying to install package
url = package if extern_url is None else extern_url
pipmain(['install', '--user',
url]) # will raise SystemExit Error if fails
# trying to load again
try:
return __import__(package)
except ImportError:
import sys
import site
user_site = site.getusersitepackages()
if user_site not in sys.path:
sys.path.append(user_site)
return __import__(package)
return __import__(package)
def try_import_dali():
"""Try import NVIDIA DALI at runtime.
"""
try:
dali = __import__('nvidia.dali', fromlist=['pipeline', 'ops', 'types'])
dali.Pipeline = dali.pipeline.Pipeline
except ImportError:
class dali:
class Pipeline:
def __init__(self):
raise NotImplementedError(
"DALI not found, please check if you installed it correctly."
)
return dali
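# A quick sketch of the helpers above. "tqdm" is an arbitrary example
# package name; import_try_install may shell out to pip on a cache miss.
def _example_filesystem_usage():
    cv2 = try_import_cv2()
    tqdm = import_try_install("tqdm")
    makedirs(get_model_dir("buffalo_l"))
    return cv2, tqdm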

View File

@ -0,0 +1,52 @@
import os
import os.path as osp
import zipfile
from .download import download_file
BASE_REPO_URL = 'https://github.com/deepinsight/insightface/releases/download/v0.7'
def download(sub_dir, name, force=False, root='~/.insightface'):
_root = os.path.expanduser(root)
dir_path = os.path.join(_root, sub_dir, name)
if osp.exists(dir_path) and not force:
return dir_path
print('download_path:', dir_path)
zip_file_path = os.path.join(_root, sub_dir, name + '.zip')
model_url = "%s/%s.zip"%(BASE_REPO_URL, name)
download_file(model_url,
path=zip_file_path,
overwrite=True)
if not os.path.exists(dir_path):
os.makedirs(dir_path)
with zipfile.ZipFile(zip_file_path) as zf:
zf.extractall(dir_path)
#os.remove(zip_file_path)
return dir_path
def ensure_available(sub_dir, name, root='~/.insightface'):
return download(sub_dir, name, force=False, root=root)
def download_onnx(sub_dir, model_file, force=False, root='~/.insightface', download_zip=False):
_root = os.path.expanduser(root)
model_root = osp.join(_root, sub_dir)
new_model_file = osp.join(model_root, model_file)
if osp.exists(new_model_file) and not force:
return new_model_file
if not osp.exists(model_root):
os.makedirs(model_root)
print('download_path:', new_model_file)
if not download_zip:
model_url = "%s/%s"%(BASE_REPO_URL, model_file)
download_file(model_url,
path=new_model_file,
overwrite=True)
else:
model_url = "%s/%s.zip"%(BASE_REPO_URL, model_file)
zip_file_path = new_model_file+".zip"
download_file(model_url,
path=zip_file_path,
overwrite=True)
with zipfile.ZipFile(zip_file_path) as zf:
zf.extractall(model_root)
return new_model_file
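# A minimal usage sketch (assumes network access). "buffalo_l" is the stock
# insightface model-pack name; it is fetched as a zip from BASE_REPO_URL and
# extracted under ~/.insightface/models on first use.
def _example_download_usage():
    model_dir = ensure_available("models", "buffalo_l")
    return model_dir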

View File

@ -0,0 +1,116 @@
import cv2
import math
import numpy as np
from skimage import transform as trans
def transform(data, center, output_size, scale, rotation):
scale_ratio = scale
rot = float(rotation) * np.pi / 180.0
#translation = (output_size/2-center[0]*scale_ratio, output_size/2-center[1]*scale_ratio)
t1 = trans.SimilarityTransform(scale=scale_ratio)
cx = center[0] * scale_ratio
cy = center[1] * scale_ratio
t2 = trans.SimilarityTransform(translation=(-1 * cx, -1 * cy))
t3 = trans.SimilarityTransform(rotation=rot)
t4 = trans.SimilarityTransform(translation=(output_size / 2,
output_size / 2))
t = t1 + t2 + t3 + t4
M = t.params[0:2]
cropped = cv2.warpAffine(data,
M, (output_size, output_size),
borderValue=0.0)
return cropped, M
def trans_points2d(pts, M):
new_pts = np.zeros(shape=pts.shape, dtype=np.float32)
for i in range(pts.shape[0]):
pt = pts[i]
new_pt = np.array([pt[0], pt[1], 1.], dtype=np.float32)
new_pt = np.dot(M, new_pt)
#print('new_pt', new_pt.shape, new_pt)
new_pts[i] = new_pt[0:2]
return new_pts
def trans_points3d(pts, M):
scale = np.sqrt(M[0][0] * M[0][0] + M[0][1] * M[0][1])
#print(scale)
new_pts = np.zeros(shape=pts.shape, dtype=np.float32)
for i in range(pts.shape[0]):
pt = pts[i]
new_pt = np.array([pt[0], pt[1], 1.], dtype=np.float32)
new_pt = np.dot(M, new_pt)
#print('new_pt', new_pt.shape, new_pt)
new_pts[i][0:2] = new_pt[0:2]
new_pts[i][2] = pts[i][2] * scale
return new_pts
def trans_points(pts, M):
if pts.shape[1] == 2:
return trans_points2d(pts, M)
else:
return trans_points3d(pts, M)
def estimate_affine_matrix_3d23d(X, Y):
''' Estimate an affine transform between two 3D point sets using a least-squares solution
Args:
X: [n, 3]. 3D points (fixed)
Y: [n, 3]. corresponding 3D points (moving). Y = PX
Returns:
P_Affine: (3, 4). Affine camera matrix (the third row is [0, 0, 0, 1]).
'''
X_homo = np.hstack((X, np.ones([X.shape[0],1]))) #n x 4
P = np.linalg.lstsq(X_homo, Y, rcond=None)[0].T # Affine matrix. 3 x 4
return P
def P2sRt(P):
''' decomposing camera matrix P
Args:
P: (3, 4). Affine Camera Matrix.
Returns:
s: scale factor.
R: (3, 3). rotation matrix.
t: (3,). translation.
'''
t = P[:, 3]
R1 = P[0:1, :3]
R2 = P[1:2, :3]
s = (np.linalg.norm(R1) + np.linalg.norm(R2))/2.0
r1 = R1/np.linalg.norm(R1)
r2 = R2/np.linalg.norm(R2)
r3 = np.cross(r1, r2)
R = np.concatenate((r1, r2, r3), 0)
return s, R, t
def matrix2angle(R):
''' get three Euler angles from Rotation Matrix
Args:
R: (3,3). rotation matrix
Returns:
x: pitch
y: yaw
z: roll
'''
sy = math.sqrt(R[0,0] * R[0,0] + R[1,0] * R[1,0])
singular = sy < 1e-6
if not singular:
x = math.atan2(R[2,1] , R[2,2])
y = math.atan2(-R[2,0], sy)
z = math.atan2(R[1,0], R[0,0])
else:
x = math.atan2(-R[1,2], R[1,1])
y = math.atan2(-R[2,0], sy)
z = 0
# rx, ry, rz = np.rad2deg(x), np.rad2deg(y), np.rad2deg(z)
rx, ry, rz = x*180/np.pi, y*180/np.pi, z*180/np.pi
return rx, ry, rz
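# A worked sketch of the pose-recovery chain above: X holds canonical 3D
# landmarks and Y their posed counterparts (both hypothetical [n, 3] arrays).
def _example_pose_recovery(X, Y):
    P = estimate_affine_matrix_3d23d(X, Y)  # (3, 4) affine camera matrix
    s, R, t = P2sRt(P)                      # scale, rotation, translation
    pitch, yaw, roll = matrix2angle(R)      # Euler angles in degrees
    return s, (pitch, yaw, roll), t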

View File

@ -62,7 +62,7 @@ class Unprompted:
@shortcodes.register(shortcode_name, None, preprocess)
def handler(keyword, pargs, kwargs, context):
self.prep_for_shortcode(keyword,pargs,kwargs,context)
self.prep_for_shortcode(keyword, pargs, kwargs, context)
return (self.shortcode_objects[f"{keyword}"].run_atomic(pargs, kwargs, context))
# Normal atomic
@ -70,7 +70,7 @@ class Unprompted:
@shortcodes.register(shortcode_name)
def handler(keyword, pargs, kwargs, context):
self.prep_for_shortcode(keyword,pargs,kwargs,context)
self.prep_for_shortcode(keyword, pargs, kwargs, context)
return (self.shortcode_objects[f"{keyword}"].run_atomic(pargs, kwargs, context))
else:
# Allow shortcode to run before inner content
@ -81,7 +81,7 @@ class Unprompted:
@shortcodes.register(shortcode_name, f"{self.Config.syntax.tag_close}{shortcode_name}", preprocess)
def handler(keyword, pargs, kwargs, context, content):
self.prep_for_shortcode(keyword,pargs,kwargs,context,content)
self.prep_for_shortcode(keyword, pargs, kwargs, context, content)
return (self.shortcode_objects[f"{keyword}"].run_block(pargs, kwargs, context, content))
# Normal block
@ -89,7 +89,7 @@ class Unprompted:
@shortcodes.register(shortcode_name, f"{self.Config.syntax.tag_close}{shortcode_name}")
def handler(keyword, pargs, kwargs, context, content):
self.prep_for_shortcode(keyword,pargs,kwargs,context,content)
self.prep_for_shortcode(keyword, pargs, kwargs, context, content)
return (self.shortcode_objects[f"{keyword}"].run_block(pargs, kwargs, context, content))
# Setup extra routines
@ -107,7 +107,7 @@ class Unprompted:
self.log.info(f"Finished loading in {time.time()-start_time} seconds.")
def __init__(self, base_dir="."):
self.VERSION = "10.6.0"
self.VERSION = "10.7.0"
self.shortcode_modules = {}
self.shortcode_objects = {}
@ -178,7 +178,7 @@ class Unprompted:
def start(self, string, debug=True):
if debug: self.log.debug("Loading global variables...")
for global_var, value in self.Config.globals.__dict__.items():
self.shortcode_user_vars[self.Config.syntax.global_prefix+global_var] = value
self.shortcode_user_vars[self.Config.syntax.global_prefix + global_var] = value
if debug: self.log.debug("Main routine started...")
self.routine = "main"
self.conditional_depth = -1
@ -204,13 +204,13 @@ class Unprompted:
return processed
def process_string(self, string, context=None, cleanup_extra_spaces=None):
if cleanup_extra_spaces==None: cleanup_extra_spaces = self.Config.syntax.cleanup_extra_spaces
if cleanup_extra_spaces == None: cleanup_extra_spaces = self.Config.syntax.cleanup_extra_spaces
self.conditional_depth += 1
if context: self.current_context = context
# First, sanitize contents
string = self.shortcode_parser.parse(self.sanitize_pre(string, self.Config.syntax.sanitize_before), context)
self.conditional_depth = max(0, self.conditional_depth -1)
self.conditional_depth = max(0, self.conditional_depth - 1)
return (self.sanitize_post(string, cleanup_extra_spaces))
def sanitize_pre(self, string, rules_obj, only_remove_last=False):
@ -268,7 +268,7 @@ class Unprompted:
self.kwargs = kwargs
self.context = context
self.content = content
def parse_arg(self, key, default=False, datatype=None, context=None, pargs=None, kwargs=None, arithmetic=True, delimiter=None):
"""Processes the argument, casting it to the correct datatype."""
# Load defaults from the Unprompted object
@ -285,8 +285,8 @@ class Unprompted:
if pargs and key in pargs:
return True
elif kwargs and key in kwargs:
if arithmetic: default = self.parse_advanced(str(kwargs[key]),context)
else: default = self.parse_alt_tags(str(kwargs[key]),context)
if arithmetic: default = self.parse_advanced(str(kwargs[key]), context)
else: default = self.parse_alt_tags(str(kwargs[key]), context)
if delimiter:
try:
# We will cast the value to a string so that we can split it, but
@ -300,7 +300,7 @@ class Unprompted:
try:
if type(default) == list:
for idx,val in enumerate(default):
for idx, val in enumerate(default):
default[idx] = datatype(val)
else:
default = datatype(default)
@ -310,11 +310,10 @@ class Unprompted:
return default
def parse_advanced(self, string, context=None):
"""First runs the string through parse_alt_tags, the result of which then goes through simpleeval"""
if string is None: return ""
if (len(string) < 1): return ""
string = self.parse_alt_tags(string, context)
if self.Config.advanced_expressions:
@ -360,11 +359,11 @@ class Unprompted:
string = string.replace(tmp_start, self.Config.syntax.tag_start_alt).replace(tmp_end, self.Config.syntax.tag_end_alt)
return (parser.parse(string, context))
def make_alt_tags(self, string):
"""Similar to parse_alt_tags, but in reverse; converts square brackets to nested alt tags."""
if string is None or len(string) < 1: return ""
# Find maximum nested depth
nested = 0
while True:
@ -384,7 +383,7 @@ class Unprompted:
end_new = tmp_end * (i + 1)
string = string.replace(start_old, start_new).replace(end_old, end_new)
# Convert primary square bracket tag to alt tag
string = string.replace(self.Config.syntax.tag_start, self.Config.syntax.tag_start_alt).replace(self.Config.syntax.tag_end, self.Config.syntax.tag_end_alt)
@ -453,10 +452,10 @@ class Unprompted:
this_val = self.shortcode_user_vars[att]
# Apply preset model names
if att_split[2] == "model":
if self.shortcode_user_vars["sd_base"]== "sd1": cn_dict = self.Config.stable_diffusion.controlnet.sd1_models
if self.shortcode_user_vars["sd_base"] == "sd1": cn_dict = self.Config.stable_diffusion.controlnet.sd1_models
elif self.shortcode_user_vars["sd_base"] == "sdxl": cn_dict = self.Config.stable_diffusion.controlnet.sdxl_models
if hasattr(cn_dict,this_val):
if hasattr(cn_dict, this_val):
this_val = getattr(cn_dict, this_val)
setattr(all_units[int(att_split[1])], "_".join(att_split[2:]), this_val)
cnet.update_cn_script_in_processing(this_p, all_units)
@ -548,15 +547,18 @@ class Unprompted:
if self.routine == "after":
if new_image:
self.after_processed.images[idx] = new_image
else: return self.after_processed.images[idx]
else:
return self.after_processed.images[idx]
elif "init_images" in self.shortcode_user_vars and self.shortcode_user_vars["init_images"]:
if new_image:
self.shortcode_user_vars["init_images"][idx] = new_image
else: return self.shortcode_user_vars["init_images"][idx]
else:
return self.shortcode_user_vars["init_images"][idx]
elif "default_image" in self.shortcode_user_vars:
if new_image:
self.shortcode_user_vars["default_image"] = new_image
else: return self.shortcode_user_vars["default_image"]
else:
return self.shortcode_user_vars["default_image"]
except Exception as e:
self.log.exception("Could not find the current image.")
return None
@ -570,16 +572,16 @@ class Unprompted:
if self.routine == "after":
self.shortcode_user_vars["init_images"][idx] = self.after_processed.images[idx]
# Update the SD vars if Unprompted.main_p exists
#if hasattr(self, "main_p"):
# self.update_stable_diffusion_vars(self.main_p)
return True
return None
def escape_tags(self, string, new_start = None, new_end = None):
if not new_start: new_start = self.Config.syntax.tag_escape+self.Config.syntax.tag_start_alt
if not new_end: new_end = self.Config.syntax.tag_escape+self.Config.syntax.tag_end_alt
def escape_tags(self, string, new_start=None, new_end=None):
if not new_start: new_start = self.Config.syntax.tag_escape + self.Config.syntax.tag_start_alt
if not new_end: new_end = self.Config.syntax.tag_escape + self.Config.syntax.tag_end_alt
# self.log.warning(f"string is {string}")
# self.log.warning(f"string after replacing is {string.replace(self.Config.syntax.tag_start,new_start).replace(self.Config.syntax.tag_end,new_end)}")
return string.replace(self.Config.syntax.tag_start,new_start).replace(self.Config.syntax.tag_end,new_end)
return string.replace(self.Config.syntax.tag_start, new_start).replace(self.Config.syntax.tag_end, new_end)

View File

@ -8,7 +8,7 @@ color-matcher # [zoom_enhance]
modelscope # [faceswap] face_fusion
tensorflow # [faceswap]
onnx # [faceswap]
onnxruntime # [faceswap]
onnxruntime-gpu -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/ # [faceswap] GPU support (CUDA 12 package)
mxnet -f https://dist.mxnet.io/python/cpu # [faceswap]
albumentations # [faceswap] insightface
pyiqa # [image_info] image quality assessment

View File

@ -20,6 +20,7 @@ from enum import IntEnum, auto
import sys, os, html, random
base_dir = scripts.basedir()
unprompted_dir = str(Path(*Path(base_dir).parts[-2:])).replace("\\", "/")
sys.path.append(base_dir)
# Main object
@ -27,6 +28,7 @@ from lib_unprompted.shared import Unprompted, parse_config
Unprompted = Unprompted(base_dir)
Unprompted.log.debug(f"The `base_dir` is: {base_dir}")
ext_dir = os.path.split(os.path.normpath(base_dir))[1]
if ext_dir == "unprompted":
Unprompted.log.warning("The extension folder must be renamed from unprompted to _unprompted in order to ensure compatibility with other extensions. Please see this A1111 WebUI issue for more details: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/8011")
@ -40,6 +42,13 @@ Unprompted.is_enabled = True
Unprompted.original_prompt = None
Unprompted.original_negative_prompt = ""
if os.path.exists(f"./modules_forge"):
Unprompted.webui = "forge"
else:
Unprompted.webui = "auto1111"
Unprompted.log.debug(f"WebUI type: {Unprompted.webui}")
Unprompted.wizard_template_files = []
Unprompted.wizard_template_names = []
Unprompted.wizard_template_kwargs = []
@ -95,7 +104,7 @@ def wizard_prep_event_listeners(obj):
wizard_set_event_listener(child)
def wizard_generate_template(option, is_img2img, prepend="", append=""):
def wizard_generate_template(option, is_img2img, html_safe=True, prepend="", append=""):
filepath = os.path.relpath(Unprompted.wizard_template_files[option], f"{base_dir}/{Unprompted.Config.template_directory}")
# Remove file extension
filepath = os.path.splitext(filepath)[0]
@ -120,7 +129,9 @@ def wizard_generate_template(option, is_img2img, prepend="", append=""):
this_val = gr_obj.value
if (arg_name == "prompt"): continue
this_val = Unprompted.make_alt_tags(html.escape(str(helpers.autocast(this_val)).replace("\"", "\'"), quote=False))
this_val = str(helpers.autocast(this_val)).replace("\"", "\'")
if html_safe: this_val = html.escape(this_val, quote=False)
this_val = Unprompted.make_alt_tags(this_val)
if " " in this_val: this_val = f"\"{this_val}\"" # Enclose in quotes if necessary
result += f" {arg_name}={this_val}"
@ -139,7 +150,7 @@ def wizard_generate_template(option, is_img2img, prepend="", append=""):
return (prepend + result + append)
def wizard_generate_shortcode(option, is_img2img, prepend="", append=""):
def wizard_generate_shortcode(option, is_img2img, html_safe=True, prepend="", append=""):
if hasattr(Unprompted.shortcode_objects[option], "wizard_prepend"): result = Unprompted.shortcode_objects[option].wizard_prepend
else: result = Unprompted.Config.syntax.tag_start + option
filtered_shortcodes = Unprompted.wizard_groups[WizardModes.SHORTCODES][int(is_img2img)]
@ -182,7 +193,9 @@ def wizard_generate_shortcode(option, is_img2img, prepend="", append=""):
elif (block_name == "number" or block_name == "slider"): result += f" {arg_name}={helpers.autocast(gr_obj.value)}"
elif (block_name == "textbox"):
if len(this_val) > 0: result += f" {arg_name}=\"{this_val}\""
else: result += f" {arg_name}=\"{html.escape(this_val, quote=False)}\""
else:
if html_safe: this_val = html.escape(this_val, quote=False)
result += f" {arg_name}=\"{this_val}\""
except:
pass
@ -256,10 +269,12 @@ def wizard_generate_capture(include_inference, include_prompt, include_neg_promp
def get_local_file_dir(filename=None):
unp_dir = os.path.basename(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
# unp_dir = os.path.basename(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
if filename: filepath = "/" + str(Path(os.path.relpath(filename, f"{base_dir}")).parent)
else: filepath = ""
return (f"file/extensions/{unp_dir}{filepath}")
return (f"file/{unprompted_dir}{filepath}")
def get_markdown(file):
@ -319,6 +334,7 @@ class Scripts(scripts.Script):
promos.append(f'<a href="https://payhip.com/b/hdgNR" target="_blank"><img src="{get_local_file_dir()}/images/promo_box_fantasy.png" class="thumbnail"></a><h1>Create beautiful art for your <strong>Fantasy Card Game</strong></h1><p>Generate a wide variety of creatures and characters in the style of a fantasy card game. Perfect for heroes, animals, monsters, and even crazy hybrids.</p><a href="https://payhip.com/b/hdgNR" target=_blank><button class="gr-button gr-button-lg gr-button-secondary" title="View premium assets for Unprompted">Download Now ➜</button></a>')
promos.append(f'<a href="https://github.com/ThereforeGames/unprompted" target="_blank"><img src="{get_local_file_dir()}/images/promo_github_star.png" class="thumbnail"></a><h1>Give Unprompted a <strong>star</strong> for visibility</h1><p>Most WebUI users have never heard of Unprompted. You can help more people discover it by giving the repo a ⭐ on Github. Thank you for your support!</p><a href="https://github.com/ThereforeGames/unprompted" target=_blank><button class="gr-button gr-button-lg gr-button-secondary" title="View the Unprompted repo">Visit Github ➜</button></a>')
promos.append(f'<a href="https://github.com/sponsors/ThereforeGames" target="_blank"><img src="{get_local_file_dir()}/images/promo_github_sponsor.png" class="thumbnail"></a><h1>Become a Sponsor</h1><p>One of the best ways to support Unprompted is by becoming our Sponsor on Github - sponsors receive access to a private repo containing all of our premium add-ons. <em>(Still setting that up... should be ready soon!)</em></p><a href="https://github.com/sponsors/ThereforeGames" target=_blank><button class="gr-button gr-button-lg gr-button-secondary" title="View the Unprompted repo">Visit Github ➜</button></a>')
promos.append(f'<a href="https://github.com/ThereforeGames/sd-webui-breadcrumbs" target="_blank"><img src="{get_local_file_dir()}/images/promo_breadcrumbs.png" class="thumbnail"></a><h1>Try our new Breadcrumbs extension</h1><p>From the developer of Unprompted comes <strong>sd-webui-breadcrumbs</strong>, an extension designed to improve the WebUI\'s navigation flow. Tedious "menu diving" is a thing of the past!</p><a href="https://github.com/ThereforeGames/sd-webui-breadcrumbs" target=_blank><button class="gr-button gr-button-lg gr-button-secondary" title="View the sd-webui-breadcrumbs repo">Visit Github ➜</button></a>')
with gr.Accordion("🎉 Promo", open=is_open):
plug = gr.HTML(label="plug", elem_id="promo", value=random.choice(promos))
@ -348,7 +364,7 @@ class Scripts(scripts.Script):
if (block_name == "textbox"):
if "_placeholder" in kwargs: this_placeholder = kwargs["_placeholder"]
else: this_placeholder = str(content)
obj = gr.Textbox(label=this_label, max_lines=1, placeholder=this_placeholder, info=_info, show_label=_show_label)
obj = gr.Textbox(label=this_label, lines=int(kwargs["_lines"]) if "_lines" in kwargs else 1, max_lines=int(kwargs["_max_lines"]) if "_max_lines" in kwargs else 1, placeholder=this_placeholder, info=_info, show_label=_show_label)
elif (block_name == "checkbox"):
obj = gr.Checkbox(label=this_label, value=bool(int(content)), info=_info, show_label=_show_label)
elif (block_name == "number"):
@ -651,8 +667,8 @@ class Scripts(scripts.Script):
autoinclude_obj = autoinclude_obj.children[-1]
if (autoinclude_obj.value):
if mode == WizardModes.SHORTCODES: Unprompted.original_prompt = wizard_generate_shortcode(key, is_img2img, "", Unprompted.original_prompt)
elif mode == WizardModes.TEMPLATES: Unprompted.original_prompt = wizard_generate_template(idx, is_img2img, "", Unprompted.original_prompt)
if mode == WizardModes.SHORTCODES: Unprompted.original_prompt = wizard_generate_shortcode(key, is_img2img, False, "", Unprompted.original_prompt)
elif mode == WizardModes.TEMPLATES: Unprompted.original_prompt = wizard_generate_template(idx, is_img2img, False, "", Unprompted.original_prompt)
p.all_prompts[0] = Unprompted.original_prompt # test
p.unprompted_original_prompt = Unprompted.original_prompt

View File

@ -15,7 +15,7 @@ class Shortcode():
# if "batch_indexing" in kwargs: self.batch_indexing = bool(self.Unprompted.parse_advanced(kwargs["batch_indexing"]))
batch_real_index = self.Unprompted.shortcode_user_vars["batch_real_index"] if "batch_real_index" in self.Unprompted.shortcode_user_vars else 0
dupe_index_mode = self.Unprompted.parse_arg("dupe_index_mode","concat")
dupe_index_mode = self.Unprompted.parse_arg("dupe_index_mode", "concat")
# Create list inside of list to house [after] content for this batch number
while batch_real_index >= len(self.after_content):
@ -26,10 +26,10 @@ class Shortcode():
is_new_index = index >= len(self.after_content[batch_real_index])
if is_new_index or dupe_index_mode != "skip":
self.log.debug(f"Queueing up content (Batch #{batch_real_index}, After {index}): {content}")
if is_new_index or dupe_index_mode == "replace":
self.log.debug(f"Replacing content in After routine (index {index})")
helpers.list_set(self.after_content[batch_real_index],index,content,"")
helpers.list_set(self.after_content[batch_real_index], index, content, "")
elif not is_new_index:
if dupe_index_mode == "concat":
self.log.debug(f"Concatenating content to After routine (index {index})")
@ -61,17 +61,21 @@ class Shortcode():
self.log.debug(f"{success_string} Regional Prompter")
elif script_title == "controlnet":
# Update the controlnet script args with a list of 0 units
cn_path = self.Unprompted.extension_path(self.Unprompted.Config.stable_diffusion.controlnet.extension)
if cn_path:
cn_module = helpers.import_file(f"{self.Unprompted.Config.stable_diffusion.controlnet.extension}.internal_controlnet.external_code", f"{cn_path}/internal_controlnet/external_code.py")
cn_module.update_cn_script_in_processing(self.Unprompted.main_p, [])
self.log.debug(f"{success_string} ControlNet")
else:
self.log.error("Could not communicate with ControlNet.")
if self.Unprompted.webui == "auto1111":
cn_path = self.Unprompted.extension_path(self.Unprompted.Config.stable_diffusion.controlnet.extension)
if cn_path:
cn_lib = "internal_controlnet"
# cn_lib = "lib_controlnet"
cn_module = helpers.import_file(f"{self.Unprompted.Config.stable_diffusion.controlnet.extension}.{cn_lib}.external_code", f"{cn_path}/{cn_lib}/external_code.py")
cn_module.update_cn_script_in_processing(self.Unprompted.main_p, [])
self.log.debug(f"{success_string} ControlNet")
else:
self.log.error("Could not communicate with ControlNet.")
pass
except Exception as e:
self.log.exception(f"Exception while trying to bypass an extension: {script_title}")
pass
i += 1
if processed:
@ -95,11 +99,11 @@ class Shortcode():
self.log.info(f"Processing After content for batch {batch_idx}, block {idx}...")
self.log.debug(f"After content: {content}")
self.Unprompted.shortcode_user_vars["after_index"] = idx
self.Unprompted.process_string(content, "after")
self.after_content = []
return (self.Unprompted.after_processed)
return processed

View File

@ -0,0 +1,84 @@
class Shortcode():
def __init__(self, Unprompted):
self.Unprompted = Unprompted
self.description = "Adjusts the black point of the image to maximize contrast."
self.wizard_append = Unprompted.Config.syntax.tag_end + Unprompted.Config.syntax.tag_start + Unprompted.Config.syntax.tag_close + "after" + Unprompted.Config.syntax.tag_end
def run_atomic(self, pargs, kwargs, context):
from PIL import Image, ImageOps, ImageEnhance
import numpy as np
import lib_unprompted.helpers as helpers
image = self.Unprompted.parse_alt_tags(kwargs["file"], context) if "file" in kwargs else self.Unprompted.current_image()
show = self.Unprompted.parse_arg("show", False)
out = self.Unprompted.parse_arg("out", "")
if isinstance(image, str):
try:
image = Image.open(image)
except Exception:
self.log.error(f"Could not open image {image}")
return ""
# Reinterpretation of Photoshop's "Auto Tone"
# Thank you to Gerald Bakker for the following writeup on the algorithm:
# https://geraldbakker.nl/psnumbers/auto-options.html
shadows = np.array(helpers.str_to_rgb(self.Unprompted.parse_arg("shadows", "0,0,0")))
# midtones are only used in other algorithms:
midtones = helpers.str_to_rgb(self.Unprompted.parse_arg("midtones", "128,128,128"))
highlights = np.array(helpers.str_to_rgb(self.Unprompted.parse_arg("highlights", "255,255,255")))
shadow_clip = self.Unprompted.parse_arg("shadow_clip", 0.001)
highlight_clip = self.Unprompted.parse_arg("highlight_clip", 0.001)
# Convert the image to a numpy array
img_array = np.array(image, dtype=np.float32)
def calculate_adjustment_values(hist, total_pixels, clip_percent):
clip_threshold = total_pixels * clip_percent
cumulative_hist = hist.cumsum()
# Find the first and last indices where the cumulative histogram exceeds the clip thresholds
lower_bound_idx = np.where(cumulative_hist > clip_threshold)[0][0]
upper_bound_idx = np.where(cumulative_hist < (total_pixels - clip_threshold))[0][-1]
return lower_bound_idx, upper_bound_idx
# Process each channel (R, G, B) separately
for channel in range(3):
# Calculate the histogram of the current channel
hist, _ = np.histogram(img_array[:, :, channel].flatten(), bins=256, range=[0, 255])
# Total number of pixels
total_pixels = img_array.shape[0] * img_array.shape[1]
# Calculate the adjustment values based on clipping percentages
dark_value, light_value = calculate_adjustment_values(hist, total_pixels, shadow_clip)
_, upper_light_value = calculate_adjustment_values(hist, total_pixels, highlight_clip)
# Adjust light_value using upper_light_value for highlights
light_value = max(light_value, upper_light_value)
# Avoid division by zero
if light_value == dark_value:
continue
# Scale and clip the channel values
img_array[:, :, channel] = (img_array[:, :, channel] - dark_value) * (highlights[channel] - shadows[channel]) / (light_value - dark_value) + shadows[channel]
img_array[:, :, channel] = np.clip(img_array[:, :, channel], 0, 255)
# Make sure the data type is correct for PIL
img_array = np.clip(img_array, 0, 255).astype(np.uint8)
new_image = Image.fromarray(img_array)
if show:
self.Unprompted.after_processed.images.append(image)
if out:
new_image.save(out)
self.Unprompted.current_image(new_image)
return ""
def ui(self, gr):
gr.Textbox(label="Path to image (uses SD image by default) 🡢 str")

View File

@ -41,8 +41,6 @@ class Shortcode():
contents = self.Unprompted.shortcode_objects["function"].functions[name]
next_context = name
else:
# self.log.debug(f"{name} is assumed to be a filepath")
file = self.Unprompted.parse_filepath(helpers.str_with_ext(name, self.Unprompted.Config.txt_format), context=context, must_exist=False)
if not os.path.exists(file):
@ -77,4 +75,4 @@ class Shortcode():
def ui(self, gr):
gr.Textbox(label="Function name or filepath 🡢 str", max_lines=1)
gr.Textbox(label="Expected encoding 🡢 _encoding", max_lines=1, value="utf-8")
pass
pass

View File

@ -12,7 +12,9 @@ class Shortcode():
task = self.Unprompted.parse_advanced(kwargs["task"], context) if "task" in kwargs else "text-generation"
do_cache = self.Unprompted.shortcode_var_is_true("cache", pargs, kwargs)
instruction = self.Unprompted.parse_arg("instruction", "")
do_cache = not self.Unprompted.shortcode_var_is_true("unload", pargs, kwargs)
output_key = "generated_text"
if task == "summarization": output_key = "summary_text"
@ -25,7 +27,7 @@ class Shortcode():
model_dir = f"{self.Unprompted.base_dir}/{self.Unprompted.Config.subdirectories.models}/gpt"
model_name = self.Unprompted.parse_advanced(kwargs["model"], context) if "model" in kwargs else "Gustavosta/MagicPrompt-Stable-Diffusion"
model_name = self.Unprompted.parse_advanced(kwargs["model"], context) if "model" in kwargs else "LykosAI/GPT-Prompt-Expansion-Fooocus-v2"
if do_cache and model_name == self.cache_model_name and task == self.cache_task:
tokenizer = self.cache_tokenizer
@ -34,7 +36,7 @@ class Shortcode():
model = model_name
tokenizer = model
if "task" == "text-generation":
if task == "text-generation":
tokenizer = AutoTokenizer.from_pretrained(model, cache_dir=model_dir)
model = AutoModelForCausalLM.from_pretrained(model, cache_dir=model_dir)
@ -48,13 +50,17 @@ class Shortcode():
generator = pipeline(task, model=model, tokenizer=tokenizer, model_kwargs={"cache_dir": model_dir}, device=self.Unprompted.main_p.sd_model.device)
gpt_result = generator(content, min_length=min_length, max_length=max_length, num_return_sequences=num_return_sequences)[0][output_key]
gpt_result = generator(content, min_length=min_length, max_length=max_length, num_return_sequences=num_return_sequences, prefix=instruction)[0][output_key]
if instruction:
gpt_result = gpt_result.replace(instruction, "")
return gpt_result
def ui(self, gr):
gr.Dropdown(label="GPT model 🡢 model", info="The first time you use a model, it will be downloaded to your `unprompted/models/gpt` directory. Each model is approximately between 300MB-1.4GB. Credit to the model author names are included in the dropdown below.", value="Gustavosta/MagicPrompt-Stable-Diffusion", choices=["Gustavosta/MagicPrompt-Stable-Diffusion", "daspartho/prompt-extend", "succinctly/text2image-prompt-generator", "microsoft/Promptist", "AUTOMATIC/promptgen-lexart", "AUTOMATIC/promptgen-majinai-safe", "AUTOMATIC/promptgen-majinai-unsafe", "Gustavosta/MagicPrompt-Dalle", "kmewhort/stable-diffusion-prompt-bolster", "Ar4ikov/gpt2-650k-stable-diffusion-prompt-generator", "Ar4ikov/gpt2-medium-650k-stable-diffusion-prompt-generator", "crumb/bloom-560m-RLHF-SD2-prompter-aesthetic", "Meli/GPT2-Prompt", "DrishtiSharma/StableDiffusion-Prompt-Generator-GPT-Neo-125M", "facebook/bart-large-cnn", "gpt2"])
gr.Dropdown(label="GPT model 🡢 model", info="The first time you use a model, it will be downloaded to your `unprompted/models/gpt` directory. Each model is approximately between 300MB-1.4GB. Credit to the model author names are included in the dropdown below.", value="LykosAI/GPT-Prompt-Expansion-Fooocus-v2", choices=["LykosAI/GPT-Prompt-Expansion-Fooocus-v2", "Gustavosta/MagicPrompt-Stable-Diffusion", "daspartho/prompt-extend", "succinctly/text2image-prompt-generator", "microsoft/Promptist", "AUTOMATIC/promptgen-lexart", "AUTOMATIC/promptgen-majinai-safe", "AUTOMATIC/promptgen-majinai-unsafe", "Gustavosta/MagicPrompt-Dalle", "kmewhort/stable-diffusion-prompt-bolster", "Ar4ikov/gpt2-650k-stable-diffusion-prompt-generator", "Ar4ikov/gpt2-medium-650k-stable-diffusion-prompt-generator", "crumb/bloom-560m-RLHF-SD2-prompter-aesthetic", "Meli/GPT2-Prompt", "DrishtiSharma/StableDiffusion-Prompt-Generator-GPT-Neo-125M", "facebook/bart-large-cnn", "gpt2"])
gr.Text(label="Instruction 🡢 instruction", value="", info="Text to prepend to the content; may help steer the model's output.")
gr.Dropdown(label="Task 🡢 task", info="Not every model is compatible with every task.", value="text-generation", choices=["text-generation", "summarization"])
gr.Number(label="Minimum number of words returned 🡢 min_length", value=1, interactive=True)
gr.Number(label="Maximum number of words returned 🡢 max_length", value=50, interactive=True)
gr.Checkbox(label="Cache the model 🡢 cache")
gr.Checkbox(label="Unload the model from cache after use 🡢 unload")

View File

@ -3,7 +3,7 @@ class Shortcode():
self.Unprompted = Unprompted
self.description = "Swap the face in an image using one or more techniques. Note that the Facelift template is more user-friendly for this purpose."
self.fs_pipelines = ["face_fusion","ghost","insightface"]
self.fs_pipelines = ["face_fusion", "ghost", "insightface"]
self.fs_now = ""
self.fs_pipeline = {}
for pipeline in self.fs_pipelines:
@ -19,48 +19,61 @@ class Shortcode():
import lib_unprompted.helpers as helpers
from PIL import Image
visibility = self.Unprompted.parse_arg("visibility",1.0)
unload_parts = self.Unprompted.parse_arg("unload","")
minimum_similarity = self.Unprompted.parse_arg("minimum_similarity",-1000.0)
visibility = self.Unprompted.parse_arg("visibility", 1.0)
unload_parts = self.Unprompted.parse_arg("unload", "")
minimum_similarity = self.Unprompted.parse_arg("minimum_similarity", -1000.0)
prefer_gpu = self.Unprompted.parse_arg("prefer_gpu", True)
if len(pargs) < 1:
self.log.error("You must pass a path to a face image as the first parg.")
return ""
all_pipelines = helpers.ensure(self.Unprompted.parse_arg("pipeline","insightface"),list)
all_pipelines = helpers.ensure(self.Unprompted.parse_arg("pipeline", "insightface"), list)
# (kwargs["pipeline"] if "pipeline" in kwargs else "insightface").split(self.Unprompted.Config.syntax.delimiter)
providers = ["CPUExecutionProvider"]
providers = ["CUDAExecutionProvider" if prefer_gpu else "CPUExecutionProvider"]
model_dir = f"{self.Unprompted.base_dir}/{self.Unprompted.Config.subdirectories.models}"
_body = self.Unprompted.parse_alt_tags(kwargs["body"],context) if "body" in kwargs else False
_body = self.Unprompted.parse_alt_tags(kwargs["body"], context) if "body" in kwargs else False
if _body:
orig_img = Image.open(_body)
else: orig_img = self.Unprompted.current_image()
else:
orig_img = self.Unprompted.current_image()
face_string = self.Unprompted.parse_advanced(pargs[0])
faces = face_string.split(self.Unprompted.Config.syntax.delimiter)
def get_cached(part):
if part in self.fs_pipeline[self.fs_now] and part not in unload_parts and "all" not in unload_parts:
if part in self.fs_pipeline[self.fs_now] and part not in unload_parts and "all" not in unload_parts and "export_embedding" not in pargs:
self.log.info(f"Using cached {part}.")
return self.fs_pipeline[self.fs_now][part]
self.log.info(f"Processing {part}...")
return False
for swap_method in all_pipelines:
result = None
self.log.info(f"Starting faceswap: {swap_method}")
self.fs_now = swap_method
gender_bonus = self.Unprompted.parse_arg("gender_bonus", 50)
age_influence = self.Unprompted.parse_arg("age_influence", 1)
if swap_method == "insightface":
import lib_unprompted.insightface as insightface
if prefer_gpu:
import lib_unprompted.insightface_cuda as insightface
else:
import lib_unprompted.insightface as insightface
import numpy as np
import cv2
import torch
def get_faces(img_data: np.ndarray, face_index=0, det_size=(640, 640)):
face_analyser = get_cached("analyser")
if not face_analyser:
face_analyser = insightface.app.FaceAnalysis(name="buffalo_l", providers=providers)
self.fs_pipeline[swap_method]["analyser"] = face_analyser
def get_faces(img_data: np.ndarray, face_index=0, det_size=(640, 640)):
face_analyser.prepare(ctx_id=0, det_size=det_size)
face = face_analyser.get(img_data)
@ -74,7 +87,7 @@ class Shortcode():
return None
these_faces = (self.fs_face_path == face_string) and get_cached("face")
if not these_faces:
if not these_faces:
temp_dict = []
for facepath in faces:
# Avoid reloading faces that were already in self.fs_face_path
@ -87,7 +100,7 @@ class Shortcode():
from safetensors.torch import load_file
tensors = load_file(facepath)
embedding = tensors["embedding"].numpy()
face = insightface.app.common.Face(embedding=embedding)
face = insightface.app.common.Face(embedding=embedding, gender=tensors["gender"] if "gender" in tensors else 0, age=tensors["age"] if "age" in tensors else 18)
except:
self.log.error(f"Could not parse face from the safetensors file at {facepath}.")
continue
@ -100,17 +113,19 @@ class Shortcode():
temp_dict.append(face)
self.fs_pipeline[swap_method]["face"] = temp_dict
if "export_embedding" in pargs:
import os
from safetensors.torch import save_file
self.log.info("Blending faces together...")
avg_embedding = np.mean([obj.embedding for obj in temp_dict], axis=0)
face = insightface.app.common.Face(embedding=avg_embedding)
avg_gender = int(np.mean([obj.gender for obj in temp_dict], axis=0))
avg_age = int(np.mean([obj.age for obj in temp_dict], axis=0))
face = insightface.app.common.Face(embedding=avg_embedding, gender=avg_gender, age=avg_age)
self.fs_pipeline[swap_method]["face"] = [face]
embedding_str = self.Unprompted.parse_arg("embedding_path","blended_faces")
embedding_str = self.Unprompted.parse_arg("embedding_path", "blended_faces")
embedding_path = self.Unprompted.parse_filepath(helpers.str_with_ext(embedding_str, ".safetensors"), context=context, must_exist=False, root=self.Unprompted.base_dir + "/user/faces")
os.makedirs(os.path.dirname(embedding_path), exist_ok=True)
# If embedding file already exists, increment the filename until it doesn't
@ -121,9 +136,8 @@ class Shortcode():
dupe_counter += 1
self.log.info(f"Exporting to {embedding_path}...")
tensors = {"embedding": torch.tensor(face["embedding"])}
save_file(tensors, embedding_path)
tensors = {"embedding": torch.tensor(face["embedding"]), "gender": torch.tensor(face["gender"]), "age": torch.tensor(face["age"])}
save_file(tensors, embedding_path)
target_img = cv2.cvtColor(np.array(orig_img), cv2.COLOR_RGB2BGR)
@ -132,7 +146,7 @@ class Shortcode():
this_model = get_cached("model")
if not this_model:
if not helpers.download_file(f"{model_dir}/insightface/inswapper_128.onnx","https://github.com/facefusion/facefusion-assets/releases/download/models/inswapper_128.onnx"):
if not helpers.download_file(f"{model_dir}/insightface/inswapper_128.onnx", "https://github.com/facefusion/facefusion-assets/releases/download/models/inswapper_128.onnx"):
continue
model_path = f"{model_dir}/insightface/inswapper_128.onnx"
self.fs_pipeline[swap_method]["model"] = insightface.model_zoo.get_model(model_path, providers=providers)
@ -142,36 +156,49 @@ class Shortcode():
for source_idx, source_face in enumerate(self.fs_pipeline[swap_method]["face"]):
self.log.debug(f"Seeking swap target for new face #{source_idx}")
similarities = [None]*len(target_faces)
similarities = [None] * len(target_faces)
for idx, target_face in enumerate(target_faces):
# TODO: Utilize target_face.pose for similarity check?
# For each face, find the most similar face in the source image and swap it in.
if target_face.embedding is not None:
# Find the most similar face in the source image
similarity = np.dot(
source_face.embedding,
target_face.embedding,
source_face.embedding,
target_face.embedding,
)
if gender_bonus:
self.log.debug(f"Source gender is {source_face.gender}, target face #{idx} gender is {target_face.gender}")
if source_face.gender == target_face.gender:
similarity += gender_bonus
if age_influence:
self.log.debug(f"Source age is {source_face.age}, target face #{idx} age is {target_face.age}")
age_diff = abs(source_face.age - target_face.age)
similarity -= age_diff * age_influence
self.log.debug(f"Similarity of face #{idx}: {similarity}")
similarities[idx] = similarity
highest_similarity = max(similarities)
if highest_similarity >= minimum_similarity:
most_similar_idx = similarities.index(max(similarities))
result = self.fs_pipeline[swap_method]["model"].get(
result,
target_faces[most_similar_idx],
source_face,
)
result,
target_faces[most_similar_idx],
source_face,
)
# Remove this target face to avoid swapping it with the remaining images
target_faces.pop(most_similar_idx)
# Break out of the source_face loop in case there are no more target faces
if not target_faces: break
else:
self.log.info("No faces met the minimum similarity threshold.")
result = Image.fromarray(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
else:
self.log.error(f"No target face detected.")
@ -208,8 +235,8 @@ class Shortcode():
from lib_unprompted.ghost.models.config_sr import TestOptions
# Prep default args
kwargs["G_path"] = self.Unprompted.parse_arg("G_path",f"{model_dir}/ghost/G_unet_2blocks.pth")
kwargs["backbone"] = self.Unprompted.parse_arg("backbone","unet")
kwargs["G_path"] = self.Unprompted.parse_arg("G_path", f"{model_dir}/ghost/G_unet_2blocks.pth")
kwargs["backbone"] = self.Unprompted.parse_arg("backbone", "unet")
kwargs["num_blocks"] = self.Unprompted.parse_arg("num_blocks", 2)
kwargs["batch_size"] = self.Unprompted.parse_arg("batch_size", 40)
kwargs["crop_size"] = self.Unprompted.parse_arg("crop_size", 224)
@ -236,29 +263,29 @@ class Shortcode():
model = self.fs_pipeline[swap_method]["model"]["model"]
else:
# process downloads
helpers.download_file(f"{model_dir}/ghost/antelope/glintr100.onnx","https://github.com/sberbank-ai/sber-swap/releases/download/antelope/glintr100.onnx")
helpers.download_file(f"{model_dir}/ghost/antelope/scrfd_10g_bnkps.onnx","https://github.com/sberbank-ai/sber-swap/releases/download/antelope/scrfd_10g_bnkps.onnx")
helpers.download_file(f"{model_dir}/ghost/backbone.pth","https://github.com/sberbank-ai/sber-swap/releases/download/arcface/backbone.pth")
helpers.download_file(f"{model_dir}/ghost/G_unet_2blocks.pth","https://github.com/sberbank-ai/sber-swap/releases/download/sber-swap-v2.0/G_unet_2blocks.pth")
helpers.download_file(f"{model_dir}/ghost/antelope/glintr100.onnx", "https://github.com/sberbank-ai/sber-swap/releases/download/antelope/glintr100.onnx")
helpers.download_file(f"{model_dir}/ghost/antelope/scrfd_10g_bnkps.onnx", "https://github.com/sberbank-ai/sber-swap/releases/download/antelope/scrfd_10g_bnkps.onnx")
helpers.download_file(f"{model_dir}/ghost/backbone.pth", "https://github.com/sberbank-ai/sber-swap/releases/download/arcface/backbone.pth")
helpers.download_file(f"{model_dir}/ghost/G_unet_2blocks.pth", "https://github.com/sberbank-ai/sber-swap/releases/download/sber-swap-v2.0/G_unet_2blocks.pth")
# model for face cropping
app = Face_detect_crop(name="antelope", root=f"{model_dir}/ghost")
app.prepare(ctx_id= 0, det_thresh=0.6, det_size=(640,640))
app.prepare(ctx_id=0, det_thresh=0.6, det_size=(640, 640))
# main model for generation
G = AEI_Net(args.backbone, num_blocks=args.num_blocks, c_id=512)
G.eval()
G.load_state_dict(torch.load(args.G_path, map_location=torch.device('cpu')))
G.load_state_dict(torch.load(args.G_path, map_location=torch.device("cuda" if prefer_gpu else "cpu")))
G = G.cuda()
G = G.half()
# arcface model to get face embedding
netArc = iresnet100(fp16=False)
netArc.load_state_dict(torch.load(f'{model_dir}/ghost/backbone.pth'))
netArc=netArc.cuda()
netArc = netArc.cuda()
netArc.eval()
# model to get face landmarks
# model to get face landmarks
handler = Handler(f'{self.Unprompted.base_dir}/lib_unprompted/ghost/coordinate_reg/model/2d106det', 0, root=f"{model_dir}/ghost", ctx_id=0, det_size=640)
# model to make superres of face, set use_sr=True if you want to use super resolution or use_sr=False if you don't
@ -271,44 +298,44 @@ class Shortcode():
model.netG.train()
else:
model = None
self.fs_pipeline[swap_method]["model"] = {}
self.fs_pipeline[swap_method]["model"]["app"] = app
self.fs_pipeline[swap_method]["model"]["G"] = G
self.fs_pipeline[swap_method]["model"]["netArc"] = netArc
self.fs_pipeline[swap_method]["model"]["handler"] = handler
self.fs_pipeline[swap_method]["model"]["model"] = model
return app, G, netArc, handler, model
app, G, netArc, handler, model = init_models(args)
# get crops from source images
# print('List of source paths: ',args.source_paths)
source = []
try:
for source_img in args.source_paths:
for source_img in args.source_paths:
img = cv2.imread(source_img)
img = crop_face(img, app, args.crop_size)[0]
source.append(img[:, :, ::-1])
except TypeError:
self.log.error("Could not parse face from the image in given filepath.")
return ""
target_full = helpers.pil_to_cv2(orig_img)
full_frames = [target_full]
# get target faces that are used for swap
set_target = True
target = [crop_face(target_full, app, args.crop_size)[0]]
# start = time.time()
final_frames_list, crop_frames_list, full_frames, tfm_array_list = model_inference(full_frames, source, target, netArc, G, app, set_target, similarity_th=args.similarity_th, crop_size=args.crop_size, BS=args.batch_size)
result = get_final_image(final_frames_list, crop_frames_list, full_frames[0], tfm_array_list, handler)
result = Image.fromarray(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
result = Image.fromarray(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
# TODO: SimSwap pipeline does not play well with WebUI torch load functions e.g.
# ModuleNotFoundError: No module named 'models.arcface_models'
elif swap_method == "simswap":
@ -319,9 +346,9 @@ class Shortcode():
# import sys
# def add_path(path):
# if path not in sys.path:
# sys.path.insert(0, path)
# sys.path.insert(0, path)
# path = osp.join(self.Unprompted.base_dir, "lib_unprompted/simswap/models")
# add_path(path)
# add_path(path)
from torchvision import transforms
from lib_unprompted.simswap.insightface_func.face_detect_crop_single import Face_detect_crop
@ -466,7 +493,7 @@ class Shortcode():
else:
net = None
result = reverse2wholeimage(b_align_crop_tenor_list, swap_result_list, b_mat_list, crop_size, img_b_whole, None, None, True, pasring_model =net,use_mask=opt.use_mask, norm = spNorm)
result = reverse2wholeimage(b_align_crop_tenor_list, swap_result_list, b_mat_list, crop_size, img_b_whole, None, None, True, pasring_model=net, use_mask=opt.use_mask, norm=spNorm)
# Append to output window
try:
@ -479,15 +506,18 @@ class Shortcode():
self.fs_pipeline[swap_method].pop(part, None)
if "face" in unload_parts: self.fs_face_path = None
else: self.fs_face_path = face_string
return ""
def ui(self, gr):
with gr.Row():
gr.Image(label="New face image(s) to swap to 🡢 str",type="filepath",interactive=True)
gr.Image(label="Body image to perform swap on (defaults to SD output) 🡢 body",type="filepath",interactive=True)
gr.Image(label="New face image(s) to swap to 🡢 str", type="filepath", interactive=True)
gr.Image(label="Body image to perform swap on (defaults to SD output) 🡢 body", type="filepath", interactive=True)
gr.Dropdown(label="Faceswap pipeline(s) 🡢 pipeline", choices=self.fs_pipelines, value="insightface", multiselect=True, interactive=True, info="You can enable multiple pipelines with the standard delimiter. Please note that each pipeline must download its models on first use.")
gr.Checkbox(label="Export all faces as a blended safetensors embedding 🡢 export_embedding",value=False)
gr.Textbox(label="Path to save the exported embedding 🡢 embedding_path",placeholder="unprompted/user/faces/blended_faces.safetensors",interactive=True)
gr.Slider(label="Gender bonus 🡢 gender_bonus", value=50, maximum=1000, minimum=0, interactive=True, step=1)
gr.Slider(label="Age influence multiplier 🡢 age_influence", value=1, maximum=100, minimum=0, interactive=True, step=1)
gr.Checkbox(label="Export all faces as a blended safetensors embedding 🡢 export_embedding", value=False)
gr.Textbox(label="Path to save the exported embedding 🡢 embedding_path", placeholder="unprompted/user/faces/blended_faces.safetensors", interactive=True)
gr.Slider(label="Visibility 🡢 visibility", value=1.0, maximum=1.0, minimum=0.0, interactive=True, step=0.01)
gr.Dropdown(label="Unload pipeline parts from cache 🡢 unload", choices=["all","face","model"],multiselect=True,interactive=True,info="You can release some or all of the pipeline parts from your cache after inference. Useful for low-memory devices.")
gr.Checkbox(label="Prefer GPU 🡢 prefer_gpu", value=True, interactive=True)
gr.Dropdown(label="Unload pipeline parts from cache 🡢 unload", choices=["all", "face", "model","analyser"], multiselect=True, interactive=True, info="You can release some or all of the pipeline parts from your cache after inference. Useful for low-memory devices.")

View File

@ -3,4 +3,4 @@ A decent starting point to upscale images using the Tile model for ControlNet.
Ideally, you should mask out the face and run the result through Facelift.
Best ESRGAN model I'm aware of: 4x_RealisticRescaler_100000_G
[/##]
[if batch_real_index=0][sets sampler="Restart" steps=20 denoising_strength=0.25 cfg_scale=15 cn_0_enabled=1 cn_0_model=ip-adapter-plus-face_sd15 cn_0_module=ip-adapter_clip_sd15 cn_0_weight=0.5 cn_0_pixel_perfect=0 negative_prompt="rfneg UnrealisticDream BadDream BeyondV3-neg" cn_1_enabled=1 cn_1_module=inpaint_only cn_1_model=inpaint cn_1_weight=1.0 cn_1_guidance_end=1.0 cn_1_control_mode=2], best quality (worst quality:-1)[/if]
[if batch_real_index=0][sets sampler="Restart" steps=20 denoising_strength=0.25 cfg_scale=15 negative_prompt="rfneg UnrealisticDream BadDream BeyondV3-neg" cn_0_enabled=1 cn_0_model=ip-adapter-plus-face_sd15 cn_0_module=ip-adapter_clip_sd15 cn_0_weight=0.5 cn_0_pixel_perfect=0 cn_1_enabled=1 cn_1_module=inpaint_only cn_1_model=inpaint cn_1_weight=1.0 cn_1_guidance_end=1.0 cn_1_control_mode=2], best quality (worst quality:-1)[/if]

Binary file not shown.


View File

@ -1,4 +1,6 @@
[template name="Facelift v0.1.1"]
![Preview]([base_dir]/facelift.png)
An all-in-one solution for performing faceswaps by combining different models and postprocessing techniques.
[/template]
[wizard row]

Binary file not shown.


View File

@ -0,0 +1,79 @@
[template name="Magic Spice v0.0.1"]
![Preview]([base_dir]/magic_spice.png)
This template elevates your prompts using techniques from Fooocus and elsewhere. It helps ensure high-quality images regardless of the simplicity of your prompt. **Some spices may yield NSFW terms due to GPT-2 prompt expansion.**
<details><summary>📚 Documentation</summary>
<details><summary>What is a "spice?"</summary>
A spice is a prompt template that applies a set of techniques to enhance the quality of the generated image. It can include anything from extra networks to negative prompts to fluff terms.
</details>
<details><summary>Model compatibility</summary>
Spices are model-agnostic, meaning they are compatible with both Stable Diffusion 1.5 and SDXL checkpoints. Some settings such as the aspect ratio are automatically adjusted based on the architecture you're using.
</details>
<details><summary>Quality vs adherence</summary>
Optimizing for quality means that the model will try to generate the best possible image, even if it doesn't strictly adhere to the prompt. This can be useful for prompts that are too simple or too complex. However, if the spice strays too far from your intentions, try disabling GPT-2 prompt expansion and the use of negative prompts.
</details>
</details>
[/template]
[set subject _new _label="Subject" _info="Enter a prompt to enhance." _max_lines=20 _lines=3]Statue of God[/set]
[set style_preset _new _info="May download extra dependencies on first use." _ui="dropdown" _choices="none|{filelist '%BASE_DIR%/templates/common/presets/magic_spice/*.*' _basename _hide_ext}" _label="Choose Your Spice"]allspice_v1[/set]
[set aspect_ratio _new _ui="radio" _choices="■ Square|↕️ Portrait|↔️ Landscape|Custom"]■ Square[/set]
[wizard accordion _label="⚙️ Advanced Settings"]
[set inference_preset _new _info="Locks CFG scale, sampler method, etc. to recommended values" _label="Inference Preset" _ui="dropdown" _choices="none|{filelist '%BASE_DIR%/templates/common/presets/txt2img/*.*' _basename _hide_ext}"]restart_v1[/set]
[set do_fluff _new _label="Use fluff terms" _ui="checkbox"]1[/set]
[set do_gpt _new _label="Use GPT-2 prompt expansion" _ui="checkbox"]1[/set]
[set do_networks _new _label="Use extra networks" _ui="checkbox"]1[/set]
[set do_negatives _new _label="Use negative prompt" _ui="checkbox"]1[/set]
[set do_autotone _new _label="Fix contrast issues" _ui="checkbox"]1[/set]
[/wizard]
[if "style_preset != 'none'"]
[call "common/presets/magic_spice/{get style_preset}"]
[/if]
[else]
[get subject]
[/else]
[if "inference_preset != 'none'"]
[call "common/presets/txt2img/{get inference_preset}"]
[/if]
[if sd_base="sdxl"]
[switch aspect_ratio]
[case "■ Square"]
[sets width=1024 height=1024]
[/case]
[case "↕️ Portrait"]
[sets width=768 height=1344]
[/case]
[case "↔️ Landscape"]
[sets width=1344 height=768]
[/case]
[/switch]
[/if][else]
[switch aspect_ratio]
[case "■ Square"]
[sets width=512 height=512]
[/case]
[case "↕️ Portrait"]
[sets width=512 height=768]
[/case]
[case "↔️ Landscape"]
[sets width=768 height=512]
[/case]
[/switch]
[/else]
[if do_autotone]
[after][autotone][/after]
[/if]

View File

@ -1,11 +0,0 @@
[if "inference_preset == 'none'"]
[logs "Applying optimal inference settings for the Vivarium preset..."]
[sets cfg_scale=7.5 sampler_name="Euler" steps=20 denoising_strength=1.0 mask_blur=2 mask_blur_x=2 mask_blur_y=2 inpaint_full_res=1 inpaint_full_res_padding=0 interrogate=0 mask_method=none]
[img2img_autosize][civitai lora "epiCRealismHelper" 0.5 _id=110334 _debug][civitai lora "SimplePositive_v1_AutoRunMech" _mvid=159384]
[set negative_prompt _append]
([civitai embedding "rfneg" _id="120412"] [civitai embedding "UnrealisticDream" 1.0 _mvid="77173"] [civitai embedding "BadDream" _mvid="77169"] [civitai embedding "BeyondNegativev2-neg" _mvid="119407"]::0.95)
[/set]
[/if]
[else]
[logs "For optimal results with Vivarium, set your inference_preset to none" _level="warning"]
[/else]

View File

@ -1 +1 @@
[restore_faces unload="{get unload}" method=gfpgan image="{get body}"][faceswap "{get faces}" unload="{get unload_all}" visibility=0.75][restore_faces unload="{get unload}" method=gpen][upscale models="TGHQFace8x_500k|4xFaceUpSharpLDAT|4x-UltraSharp|R-ESRGAN 4x+" scale=1 limit=1 visibility=0.8 keep_res]
[restore_faces unload="{get unload}" method=gfpgan image="{get body}"][faceswap "{get faces}" unload="{get unload_all}" visibility=0.75][restore_faces unload="{get unload}" method=gpen]

View File

@ -1 +1 @@
[restore_faces method=codeformer image="{get body}"][faceswap "{get face}" unload="{get unload_all}" pipeline="ghost" ][call common/presets/txt2img/restart_v1][zoom_enhance replacement="face, best quality, hdr"][faceswap "{get face}" unload="{get unload_all}"][restore_faces unload="{get unload}"]
[faceswap "{get faces}" unload="{get unload_all}" visibility=0.75][restore_faces unload="{get unload}" method=gpen][upscale models="TGHQFace8x_500k|4xFaceUpSharpLDAT|4x-UltraSharp|R-ESRGAN 4x+" scale=1 limit=1 visibility=0.8 keep_res]

View File

@ -1 +1 @@
[faceswap "{get faces}" unload="{get unload_all}" body="{get body}"][restore_faces unload="{get unload}"]
[faceswap "{get faces}" unload="{get unload_all}" body="{get body}"][restore_faces unload="{get unload}" method=gpen]

View File

@ -1 +0,0 @@
[sets cfg_scale=4 sampler_name="DPM++ SDE" steps=20 denoising_strength=0.67 mask_blur=0]

View File

@ -0,0 +1,23 @@
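[# Style preset: optionally expands the subject with GPT-2, then applies extra networks and negative prompts suited to the active base model.]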
[set fluff]best quality, high detail[/set]
[if do_gpt]
[gpt max_length=300]([get subject]:1.1), [if do_fluff][get fluff][/if] BREAK [/gpt]
[/if]
[else][get subject][if do_fluff], [get fluff][/if][/else]
[if sd_base="sdxl"]
[if do_networks]
[civitai _file="sdxl_offset_example_v10" _id=137511 _weight=0.5]
[/if]
[if do_negatives]
[set negative_prompt _append]text, watermark, low-quality, signature, moiré pattern, downsampling, aliasing, distorted, blurry, glossy, blur, jpeg artifacts, compression artifacts, poorly drawn, low-resolution, bad, distortion, twisted, excessive, exaggerated pose, exaggerated limbs, grainy, symmetrical, duplicate, error, pattern, beginner, pixelated, fake, hyper, glitch, overexposed, high-contrast, bad-contrast[/set]
[/if]
[/if]
[else]
[if do_negatives]
[if do_networks]
[set negative_prompt _append][worst quality:worst quality, deviantart, [civitai _file=badhandv4 _id=16993], [civitai _file=rfneg _id=120412], ([civitai _file=UnrealisticDream _id=72437 _mid=77173]:1.2), [civitai _file=BeyondV4-neg _id=108821], [civitai _file=difConsistency_negative_v2 _id=87375], [civitai _file=epiCPhotoGasm-colorfulPhoto-neg _id=132719], [civitai _file=PA7_UnRealistic-Neg_v2-neg _id=208852 _mid=235232], [civitai _file=BadDream _id=72437], [civitai _file=realisticvision-negative-embedding _id=36070]:3][/set]
[/if]
[else]
[set negative_prompt _append]worst quality[/set]
[/else]
[/if]
[/else]

View File

@ -0,0 +1,13 @@
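[# Style preset geared toward anime output: screencap fluff terms with score tags, plus anatomy-focused negatives.]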
[set fluff]anime screencap BREAK score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up, official art[/set]
[if do_gpt]
[gpt max_length=150 model="FredZhang7/distilgpt2-stable-diffusion-v2"][get subject][if do_fluff], [get fluff][/if][/gpt]
[/if]
[else][get subject][if do_fluff], [get fluff][/if][/else]
[if do_networks]
[if sd_base="sdxl"]
[civitai _file="sdxl_offset_example_v10" _id=137511 _weight=0.5]
[/if]
[/if]
[if do_negatives]
[set negative_prompt _append]line art, watermark, logo, (worst quality:1.5), (low quality:1.5), (normal quality:1.5), lowres, bad anatomy, bad hands, multiple eyebrow, (cropped), extra limb, missing limbs, deformed hands, long neck, long body, (bad hands), signature, username, artist name, conjoined fingers, deformed fingers, error,(deformed|distorted|disfigured:1.21), poorly drawn, bad anatomy, wrong anatomy, mutation, mutated, (mutated hands AND fingers:1.21), bad hands, bad fingers, loss of a limb, extra limb, missing limb, floating limbs, amputation, deformed, black and white, disfigured, low contrast[/set]
[/if]

View File

@ -0,0 +1,25 @@
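[# Style preset geared toward photorealism: RAW-photo fluff terms, detail-oriented extra networks, and realism-focused negative embeddings.]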
[set fluff]BREAK high detail RAW photo, colorful, best quality, 4k resolution, professional photography, extremely detailed, film grain[/set]
[if do_gpt]
[gpt max_length=150 model="daspartho/prompt-extend"][get subject] [if do_fluff][get fluff][/if] BREAK [/gpt]
[/if][else][get subject][if do_fluff] [get fluff][/if][/else]
[if sd_base="sdxl"]
[if do_networks]
[civitai _file="RMSDXL_Photo" _id=250381 _weight=1.0][civitai _file="sdxl_offset_example_v10" _id=137511 _weight=0.5]
[/if]
[if do_negatives]
[set negative_prompt _append]text, watermark, low-quality, signature, moiré pattern, downsampling, aliasing, distorted, blurry, glossy, blur, jpeg artifacts, compression artifacts, poorly drawn, low-resolution, bad, distortion, twisted, excessive, exaggerated pose, exaggerated limbs, grainy, symmetrical, duplicate, error, pattern, beginner, pixelated, fake, hyper, glitch, overexposed, high-contrast, bad-contrast[/set]
[/if]
[/if]
[else]
[if do_networks]
[civitai lora "difConsistency_detail" 0.2 _id=87378]
[/if]
[if do_negatives]
[if do_networks]
[set negative_prompt _append][worst quality:worst quality, deviantart, [civitai _file=badhandv4 _id=16993], [civitai _file=rfneg _id=120412], ([civitai _file=UnrealisticDream _id=72437 _mid=77173]:1.2), [civitai _file=BeyondV4-neg _id=108821], [civitai _file=difConsistency_negative_v2 _id=87375], [civitai _file=epiCPhotoGasm-colorfulPhoto-neg _id=132719], [civitai _file=PA7_UnRealistic-Neg_v2-neg _id=208852 _mid=235232], [civitai _file=BadDream _id=72437], [civitai _file=realisticvision-negative-embedding _id=36070]:3][/set]
[/if]
[else]
[set negative_prompt _append]worst quality[/set]
[/else]
[/if]
[/else]

View File

@ -0,0 +1,2 @@
[# Note: This preset is only compatible with WebUI Forge and assumes that the Lightning LoRA has been merged into the active model.]
[sets cfg_scale=2 sampler_name="DPM++ 2M SDE SGMUniform" steps=6]

View File

@ -0,0 +1,2 @@
[# Note: This preset is only compatible with WebUI Forge and requires the SDXL Lightning 8-step LoRA.]
<lora:sdxl_lightning_8step_lora:1.0>[sets cfg_scale=2 sampler_name="DPM++ 2M SDE SGMUniform" steps=8]

View File

@ -0,0 +1 @@
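[# Inference preset built around the Restart sampler at a low step count.]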
[sets cfg_scale=7.5 sampler_name="Restart" steps=12]