From 71a18dcf7470accfaa16456328290388e49f3428 Mon Sep 17 00:00:00 2001
From: Vladimir Mandic
Date: Tue, 8 Jul 2025 16:16:08 -0400
Subject: [PATCH] update precommit hooks

Signed-off-by: Vladimir Mandic
---
 .pre-commit-config.yaml                     | 12 +++-
 SECURITY.md                                 |  4 +-
 .../submodules/efficientnet_repo/README.md  | 18 +++---
 .../models/base_models/midas_repo/README.md | 60 +++++++++----------
 4 files changed, 51 insertions(+), 43 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index d2793d1ca..620c2c730 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -25,15 +25,23 @@ repos:
   - id: check-case-conflict
   - id: check-merge-conflict
   - id: check-symlinks
-  - id: check-yaml
+  - id: check-illegal-windows-names
+  - id: detect-private-key
   - id: check-builtin-literals
   - id: check-case-conflict
-  - id: check-json
   - id: check-symlinks
+  - id: check-yaml
+  - id: check-json
   - id: check-toml
   - id: check-xml
   - id: end-of-file-fixer
   - id: mixed-line-ending
+  - id: check-executables-have-shebangs
+    exclude: |
+      (?x)^(
+        .*\.bat|
+        .*\.ps1
+      )$
   - id: trailing-whitespace
     exclude: |
       (?x)^(
diff --git a/SECURITY.md b/SECURITY.md
index 655ebdfcd..c5f8d33d5 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -29,8 +29,8 @@ Any code commit is validated before merge
 `SD.Next` library can establish external connections *only* for following purposes and *only* when explicitly configured by user:
 
-- Download extensions and themes indexes from automatically updated indexes 
+- Download extensions and themes indexes from automatically updated indexes
 - Download required packages and repositories from GitHub during installation/upgrade
 - Download installed/enabled extensions
 - Download models from CivitAI and/or Huggingface when instructed by user
-- Submit benchmark info upon user interaction 
+- Submit benchmark info upon user interaction
diff --git a/modules/control/proc/normalbae/nets/submodules/efficientnet_repo/README.md b/modules/control/proc/normalbae/nets/submodules/efficientnet_repo/README.md
index d8afeb4d1..026fd95d6 100644
--- a/modules/control/proc/normalbae/nets/submodules/efficientnet_repo/README.md
+++ b/modules/control/proc/normalbae/nets/submodules/efficientnet_repo/README.md
@@ -1,6 +1,6 @@
 # (Generic) EfficientNets for PyTorch
 
-A 'generic' implementation of EfficientNet, MixNet, MobileNetV3, etc. that covers most of the compute/parameter efficient architectures derived from the MobileNet V1/V2 block sequence, including those found via automated neural architecture search. 
+A 'generic' implementation of EfficientNet, MixNet, MobileNetV3, etc. that covers most of the compute/parameter efficient architectures derived from the MobileNet V1/V2 block sequence, including those found via automated neural architecture search.
 
 All models are implemented by GenEfficientNet or MobileNetV3 classes, with string based architecture definitions to configure the block layouts (idea from [here](https://github.com/tensorflow/tpu/blob/master/models/official/mnasnet/mnasnet_models.py))
@@ -20,7 +20,7 @@ All models are implemented by GenEfficientNet or MobileNetV3 classes, with strin
 * 4.5M param MobileNet-V2 110d @ 75%
 * 6.1M param MobileNet-V2 140 @ 76.5%
 * 5.8M param MobileNet-V2 120d @ 77.3%
- 
+
 ### March 23, 2020
 * Add EfficientNet-Lite models w/ weights ported from [Tensorflow TPU](https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet/lite)
 * Add PyTorch trained MobileNet-V3 Large weights with 75.77% top-1
@@ -39,7 +39,7 @@ All models are implemented by GenEfficientNet or MobileNetV3 classes, with strin
 ### Nov 22, 2019
 * New top-1 high! Ported official TF EfficientNet AdvProp (https://arxiv.org/abs/1911.09665) weights and B8 model spec. Created a new set of `ap` models since they use a different preprocessing (Inception mean/std) from the original EfficientNet base/AA/RA weights.
- 
+
 ### Nov 15, 2019
 * Ported official TF MobileNet-V3 float32 large/small/minimalistic weights
 * Modifications to MobileNet-V3 model and components to support some additional config needed for differences between TF MobileNet-V3 and mine
@@ -50,7 +50,7 @@ All models are implemented by GenEfficientNet or MobileNetV3 classes, with strin
 * Add JIT optimized mem-efficient Swish/Mish autograd.fn in addition to memory-efficient autgrad.fn
 * Activation factory to select best version of activation by name or override one globally
 * Add pretrained checkpoint load helper that handles input conv and classifier changes
- 
+
 ### Oct 27, 2019
 * Add CondConv EfficientNet variants ported from https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet/condconv
 * Add RandAug weights for TF EfficientNet B5 and B7 from https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet
@@ -75,8 +75,8 @@ Implemented models include:
 * MobileNet-V3 (https://arxiv.org/abs/1905.02244)
 * FBNet-C (https://arxiv.org/abs/1812.03443)
 * Single-Path NAS (https://arxiv.org/abs/1904.02877)
- 
-I originally implemented and trained some these models with code [here](https://github.com/rwightman/pytorch-image-models), this repository contains just the GenEfficientNet models, validation, and associated ONNX/Caffe2 export code. 
+
+I originally implemented and trained some of these models with code [here](https://github.com/rwightman/pytorch-image-models); this repository contains just the GenEfficientNet models, validation, and associated ONNX/Caffe2 export code.
 
 ## Pretrained
 
@@ -117,7 +117,7 @@ More pretrained models to come...
 
 The weights ported from Tensorflow checkpoints for the EfficientNet models do pretty much match accuracy in Tensorflow once a SAME convolution padding equivalent is added, and the same crop factors, image scaling, etc (see table) are used via cmd line args.
 
-**IMPORTANT:** 
+**IMPORTANT:**
 * Tensorflow ported weights for EfficientNet AdvProp (AP), EfficientNet EdgeTPU, EfficientNet-CondConv, EfficientNet-Lite, and MobileNet-V3 models use Inception style (0.5, 0.5, 0.5) for mean and std.
 * Enabling the Tensorflow preprocessing pipeline with `--tf-preprocessing` at validation time will improve scores by 0.1-0.5%, very close to original TF impl.
@@ -130,7 +130,7 @@ To run validation w/ TF preprocessing for tf_efficientnet_b5:
 
 To run validation for a model with Inception preprocessing, ie EfficientNet-B8 AdvProp:
 
 `python validate.py /path/to/imagenet/validation/ --model tf_efficientnet_b8_ap -b 48 --num-gpu 2 --img-size 672 --crop-pct 0.954 --mean 0.5 --std 0.5`
 
-|Model | Prec@1 (Err) | Prec@5 (Err) | Param # | Image Scaling | Image Size | Crop | 
+|Model | Prec@1 (Err) | Prec@5 (Err) | Param # | Image Scaling | Image Size | Crop |
 |---|---|---|---|---|---|---|
 | tf_efficientnet_l2_ns *tfp | 88.352 (11.648) | 98.652 (1.348) | 480 | bicubic | 800 | N/A |
 | tf_efficientnet_l2_ns | TBD | TBD | 480 | bicubic | 800 | 0.961 |
@@ -308,7 +308,7 @@ Scripts are included to
 As an example, to export the MobileNet-V3 pretrained model and then run an Imagenet validation:
 ```
 python onnx_export.py --model mobilenetv3_large_100 ./mobilenetv3_100.onnx
-python onnx_validate.py /imagenet/validation/ --onnx-input ./mobilenetv3_100.onnx 
+python onnx_validate.py /imagenet/validation/ --onnx-input ./mobilenetv3_100.onnx
 ```
 
 These scripts were tested to be working as of PyTorch 1.6 and ONNX 1.7 w/ ONNX runtime 1.4. Caffe2 compatible
diff --git a/modules/control/proc/zoe/zoedepth/models/base_models/midas_repo/README.md b/modules/control/proc/zoe/zoedepth/models/base_models/midas_repo/README.md
index 9568ea71c..88ec845d1 100644
--- a/modules/control/proc/zoe/zoedepth/models/base_models/midas_repo/README.md
+++ b/modules/control/proc/zoe/zoedepth/models/base_models/midas_repo/README.md
@@ -2,24 +2,24 @@
 This repository contains code to compute depth from a single image. It accompanies our [paper](https://arxiv.org/abs/1907.01341v3):
 
->Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer 
+>Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer
 René Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, Vladlen Koltun
 
 and our [preprint](https://arxiv.org/abs/2103.13413):
 
-> Vision Transformers for Dense Prediction 
+> Vision Transformers for Dense Prediction
 > René Ranftl, Alexey Bochkovskiy, Vladlen Koltun
 
 MiDaS was trained on up to 12 datasets (ReDWeb, DIML, Movies, MegaDepth, WSVD, TartanAir, HRWSI, ApolloScape, BlendedMVS, IRS, KITTI, NYU Depth V2) with
-multi-objective optimization. 
+multi-objective optimization.
 The original model that was trained on 5 datasets (`MIX 5` in the paper) can be found [here](https://github.com/isl-org/MiDaS/releases/tag/v2).
 The figure below shows an overview of the different MiDaS models; the bubble size scales with number of parameters.
 
 ![](figures/Improvement_vs_FPS.png)
 
-### Setup 
+### Setup
 
 1) Pick one or more models and download the corresponding weights to the `weights` folder:
 
@@ -31,9 +31,9 @@ MiDaS 3.1
 
 MiDaS 3.0: Legacy transformer models [dpt_large_384](https://github.com/isl-org/MiDaS/releases/download/v3/dpt_large_384.pt) and [dpt_hybrid_384](https://github.com/isl-org/MiDaS/releases/download/v3/dpt_hybrid_384.pt)
 
-MiDaS 2.1: Legacy convolutional models [midas_v21_384](https://github.com/isl-org/MiDaS/releases/download/v2_1/midas_v21_384.pt) and [midas_v21_small_256](https://github.com/isl-org/MiDaS/releases/download/v2_1/midas_v21_small_256.pt) 
+MiDaS 2.1: Legacy convolutional models [midas_v21_384](https://github.com/isl-org/MiDaS/releases/download/v2_1/midas_v21_384.pt) and [midas_v21_small_256](https://github.com/isl-org/MiDaS/releases/download/v2_1/midas_v21_small_256.pt)
 
-1) Set up dependencies: 
+1) Set up dependencies:
 
 ```shell
 conda env create -f environment.yaml
@@ -53,7 +53,7 @@ For the OpenVINO model, install
 ```shell
 pip install openvino
 ```
- 
+
 ### Usage
 
 1) Place one or more input images in the folder `input`.
@@ -68,19 +68,19 @@ pip install openvino
    [dpt_swin2_tiny_256](#model_type), [dpt_swin_large_384](#model_type), [dpt_next_vit_large_384](#model_type),
   [dpt_levit_224](#model_type), [dpt_large_384](#model_type), [dpt_hybrid_384](#model_type),
   [midas_v21_384](#model_type), [midas_v21_small_256](#model_type), [openvino_midas_v21_small_256](#model_type).
- 
+
 3) The resulting depth maps are written to the `output` folder.
 
 #### optional
 
 1) By default, the inference resizes the height of input images to the size of a model to fit into the encoder. This
    size is given by the numbers in the model names of the [accuracy table](#accuracy). Some models do not only support a single
-   inference height but a range of different heights. Feel free to explore different heights by appending the extra 
+   inference height but a range of different heights. Feel free to explore different heights by appending the extra
   command line argument `--height`. Unsupported height values will throw an error. Note that using this argument may
   decrease the model accuracy.
 2) By default, the inference keeps the aspect ratio of input images when feeding them into the encoder if this is
   supported by a model (all models except for Swin, Swin2, LeViT). In order to resize to a square resolution,
-   disregarding the aspect ratio while preserving the height, use the command line argument `--square`. 
+   disregarding the aspect ratio while preserving the height, use the command line argument `--square`.
 
 #### via Camera
@@ -91,7 +91,7 @@
    python run.py --model_type <model_type> --side
    ```
 
-   The argument `--side` is optional and causes both the input RGB image and the output depth map to be shown 
+   The argument `--side` is optional and causes both the input RGB image and the output depth map to be shown
    side-by-side for comparison.
 
 #### via Docker
@@ -122,7 +122,7 @@ The pretrained model is also available on [PyTorch Hub](https://pytorch.org/hub/
 
 See [README](https://github.com/isl-org/MiDaS/tree/master/tf) in the `tf` subdirectory.
 
-Currently only supports MiDaS v2.1. 
+Currently only supports MiDaS v2.1.
 
 #### via Mobile (iOS / Android)
@@ -133,16 +133,16 @@ See [README](https://github.com/isl-org/MiDaS/tree/master/mobile) in the `mobile
 
 See [README](https://github.com/isl-org/MiDaS/tree/master/ros) in the `ros` subdirectory.
 
-Currently only supports MiDaS v2.1. DPT-based models to be added. 
+Currently only supports MiDaS v2.1. DPT-based models to be added.
 
 ### Accuracy
 
 We provide a **zero-shot error** $\epsilon_d$ which is evaluated for 6 different datasets
-(see [paper](https://arxiv.org/abs/1907.01341v3)). **Lower error values are better**. 
+(see [paper](https://arxiv.org/abs/1907.01341v3)). **Lower error values are better**.
 $\color{green}{\textsf{Overall model quality is represented by the improvement}}$ ([Imp.](#improvement)) with respect to
-MiDaS 3.0 DPTL-384. The models are grouped by the height used for inference, whereas the square training resolution is given by 
-the numbers in the model names. The table also shows the **number of parameters** (in millions) and the 
+MiDaS 3.0 DPTL-384. The models are grouped by the height used for inference, whereas the square training resolution is given by
+the numbers in the model names. The table also shows the **number of parameters** (in millions) and the
 **frames per second** for inference at the training resolution (for GPU RTX 3090):
 
 | MiDaS Model | DIW<br>WHDR | Eth3d<br>AbsRel | Sintel<br>AbsRel | TUM<br>δ1 | KITTI<br>δ1 | NYUv2<br>δ1 | $\color{green}{\textsf{Imp.}}$<br>% | Par.<br>M | FPS<br>&nbsp; |
@@ -171,16 +171,16 @@ the numbers in the model names. The table also shows the **number of parameters*
 | [v3.1 LeViT224](https://github.com/isl-org/MiDaS/releases/download/v3_1/dpt_levit_224.pt)$\tiny{\square}$ | **0.1314** | **0.1206** | **0.3148** | **18.21** | **15.27*** | **8.64*** | $\color{green}{\textsf{-40}}$ | **51** | **73** |
 
 * No zero-shot error, because models are also trained on KITTI and NYU Depth V2\
-$\square$ Validation performed at **square resolution**, either because the transformer encoder backbone of a model 
-does not support non-square resolutions (Swin, Swin2, LeViT) or for comparison with these models. All other 
+$\square$ Validation performed at **square resolution**, either because the transformer encoder backbone of a model
+does not support non-square resolutions (Swin, Swin2, LeViT) or for comparison with these models. All other
 validations keep the aspect ratio. A difference in resolution limits the comparability of the zero-shot error and the
-improvement, because these quantities are averages over the pixels of an image and do not take into account the 
+improvement, because these quantities are averages over the pixels of an image and do not take into account the
 advantage of more details due to a higher resolution.\
 Best values per column and same validation height in bold
 
 #### Improvement
 
-The improvement in the above table is defined as the relative zero-shot error with respect to MiDaS v3.0 
+The improvement in the above table is defined as the relative zero-shot error with respect to MiDaS v3.0
 DPTL-384 and averaging over the datasets. So, if $\epsilon_d$ is the zero-shot error for dataset $d$, then the
 $\color{green}{\textsf{improvement}}$ is given by $100(1-(1/6)\sum_d\epsilon_d/\epsilon_{d,\rm{DPT_{L-384}}})$%.
 
@@ -193,14 +193,14 @@ and v2.0 Large384 respectively instead of v3.0 DPTL-384.
 
 Zoom in for better visibility
 ![](figures/Comparison.png)
 
-### Speed on Camera Feed 
+### Speed on Camera Feed
 
-Test configuration 
-- Windows 10 
-- 11th Gen Intel Core i7-1185G7 3.00GHz 
-- 16GB RAM 
-- Camera resolution 640x480 
-- openvino_midas_v21_small_256 
+Test configuration
+- Windows 10
+- 11th Gen Intel Core i7-1185G7 3.00GHz
+- 16GB RAM
+- Camera resolution 640x480
+- openvino_midas_v21_small_256
 
 Speed: 22 FPS
 
@@ -251,9 +251,9 @@ If you use a DPT-based model, please also cite:
 
 ### Acknowledgements
 
-Our work builds on and uses code from [timm](https://github.com/rwightman/pytorch-image-models) and [Next-ViT](https://github.com/bytedance/Next-ViT). 
+Our work builds on and uses code from [timm](https://github.com/rwightman/pytorch-image-models) and [Next-ViT](https://github.com/bytedance/Next-ViT).
 We'd like to thank the authors for making these libraries available.
 
-### License 
+### License
 
-MIT License 
+MIT License
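For reference, the improvement metric quoted in the MiDaS accuracy table above, $100(1-(1/6)\sum_d\epsilon_d/\epsilon_{d,\rm{DPT_{L-384}}})$%, can be sketched in Python. This is a minimal illustration of the formula only; the function name and call shape are assumptions, not code from the MiDaS repository:

```python
# Illustrative sketch of the MiDaS "Imp." column: one minus the mean of
# per-dataset zero-shot errors relative to the reference model
# (MiDaS 3.0 DPT_L-384), expressed as a percentage. Names are assumptions.

def improvement_pct(errors, reference_errors):
    """Return 100 * (1 - (1/N) * sum_d errors[d] / reference_errors[d]).

    Lower zero-shot errors are better, so a positive result means the
    candidate model improves on the reference across the N datasets,
    and a negative result (as for some fast square-resolution models in
    the table) means it is worse on average.
    """
    if len(errors) != len(reference_errors):
        raise ValueError("need one error per dataset for both models")
    mean_ratio = sum(e / r for e, r in zip(errors, reference_errors)) / len(errors)
    return 100.0 * (1.0 - mean_ratio)
```

For example, a model whose error is 70% of the reference's on each of the 6 datasets scores an improvement of 30, while identical errors score 0.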