Update utility code

pull/2425/head
bmaltais 2024-04-30 20:09:19 -04:00
parent 7bbc99d91b
commit 91350e5581
13 changed files with 873 additions and 422 deletions


@@ -1 +1 @@
-v24.0.9
+v24.1.0

README.md

@@ -47,29 +47,7 @@ The GUI allows you to set the training parameters and generate and run the requi
- [SDXL training](#sdxl-training)
- [Masked loss](#masked-loss)
- [Change History](#change-history)
- [2024/04/28 (v24.0.9)](#20240428-v2409)
- [2024/04/26 (v24.0.8)](#20240426-v2408)
- [2024/04/25 (v24.0.7)](#20240425-v2407)
- [2024/04/22 (v24.0.6)](#20240422-v2406)
- [2024/04/19 (v24.0.5)](#20240419-v2405)
- [New Contributors](#new-contributors)
- [2024/04/18 (v24.0.4)](#20240418-v2404)
- [What's Changed](#whats-changed)
- [New Contributors](#new-contributors-1)
- [2024/04/24 (v24.0.3)](#20240424-v2403)
- [2024/04/24 (v24.0.2)](#20240424-v2402)
- [2024/04/17 (v24.0.1)](#20240417-v2401)
- [Enhancements](#enhancements)
- [Security and Stability](#security-and-stability)
- [Shell Execution](#shell-execution)
- [Miscellaneous](#miscellaneous)
- [2024/04/10 (v23.1.5)](#20240410-v2315)
- [Security Improvements](#security-improvements)
- [2024/04/08 (v23.1.4)](#20240408-v2314)
- [2024/04/08 (v23.1.3)](#20240408-v2313)
- [2024/04/08 (v23.1.2)](#20240408-v2312)
- [2024/04/07 (v23.1.1)](#20240407-v2311)
- [2024/04/07 (v23.1.0)](#20240407-v2310)
- [v24.1.0](#v2410)
## 🦒 Colab
@@ -455,202 +433,6 @@ ControlNet dataset is used to specify the mask. The mask images should be the RG
## Change History
### 2024/04/28 (v24.0.9)
### v24.1.0
- Updated the temporary configuration file to include date and time information in the file name. This will allow for easier batching of multiple training commands, particularly useful for users who want to automate their training sessions.
- Fixed an issue with wd14 captioning where the captioning process was not functioning correctly when the recursive option was set to true. Prefixes and postfixes are now applied to all caption files in the folder.
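The timestamped naming can be sketched as follows (a minimal illustration; the prefix and exact timestamp pattern here are assumptions, not necessarily what the GUI emits):

```python
from datetime import datetime

def temp_config_name(prefix: str = "config_lora") -> str:
    """Build a temporary config file name that embeds the date and time,
    so back-to-back training commands never overwrite each other's config.
    The prefix and pattern are illustrative, not the GUI's exact ones.
    """
    timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    return f"{prefix}-{timestamp}.toml"
```

Each generated name is unique at one-second granularity, which is enough to queue several training runs from a batch script.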
### 2024/04/26 (v24.0.8)
- Set `max_train_steps` to 0 if not specified in older `.json` config files.
### 2024/04/25 (v24.0.7)
- Prevent crash if tkinter is not installed
- Fix [24.0.6] Train toml config seed type error #2370
- A new docker container is now built with every new release, eliminating the need for manual building. A big thank you to @jim60105 for his hard work in this area. You can find more information about it in the Docker section of the README.
### 2024/04/22 (v24.0.6)
- Make start and stop buttons visible in headless mode
- Add validation for lr and optimizer arguments
### 2024/04/19 (v24.0.5)
- Hide tensorboard button if tensorflow module is not installed by @bmaltais in <https://github.com/bmaltais/kohya_ss/pull/2347>
- wd14 captioning issue with undesired tags nor tag replacement by @bmaltais in <https://github.com/bmaltais/kohya_ss/pull/2350>
- Changed logger checkbox to dropdown, renamed use_wandb -> log_with by @ccharest93 in <https://github.com/bmaltais/kohya_ss/pull/2352>
#### New Contributors
- @ccharest93 made their first contribution in <https://github.com/bmaltais/kohya_ss/pull/2352>
### 2024/04/18 (v24.0.4)
#### What's Changed
- Fix options.md heading by @bmaltais in <https://github.com/bmaltais/kohya_ss/pull/2337>
- Use correct file extensions when browsing for model file by @b-fission in <https://github.com/bmaltais/kohya_ss/pull/2323>
- Add argument for Gradio's `root_path` to enable reverse proxy support by @hlky in <https://github.com/bmaltais/kohya_ss/pull/2333>
- 2325 quotes wrapping python path cause subprocess cant find target in v2403 by @bmaltais in <https://github.com/bmaltais/kohya_ss/pull/2338>
- 2330 another seemingly new data validation leads to unusable configs 2403 by @bmaltais in <https://github.com/bmaltais/kohya_ss/pull/2339>
- Fix bad Lora parameters by @bmaltais in <https://github.com/bmaltais/kohya_ss/pull/2341>
#### New Contributors
- @b-fission made their first contribution in <https://github.com/bmaltais/kohya_ss/pull/2323>
- @hlky made their first contribution in <https://github.com/bmaltais/kohya_ss/pull/2333>
### 2024/04/24 (v24.0.3)
- Fix issue with sample prompt creation
### 2024/04/24 (v24.0.2)
- Fixed issue with clip_skip not being passed as an int to sd-scripts when using old config.json files.
### 2024/04/17 (v24.0.1)
#### Enhancements
- **User Interface:** Transitioned the GUI to use a TOML file for argument passing to sd-scripts, significantly enhancing security by eliminating the need for command-line interface (CLI) use for sensitive data.
- **Training Tools:** Improved the training and TensorBoard buttons to provide a more intuitive user experience.
- **HuggingFace Integration:** Integrated a HuggingFace section in all trainer tabs, enabling authentication and use of HuggingFace's advanced AI models.
- **Gradio Upgrade:** Upgraded Gradio to version 4.20.0 to fix a previously identified bug impacting the runpod platform.
- **Metadata Support:** Added functionality for metadata capture within the GUI.
#### Security and Stability
- **Code Refactoring:** Extensively rewrote the code to address various security vulnerabilities, including removing the `shell=True` parameter from process calls.
- **Scheduler Update:** Disabled LR Warmup when using the Constant LR Scheduler to prevent traceback errors associated with sd-scripts.
#### Shell Execution
- **Conditional Shell Usage:** Added support for optional shell usage when executing external sd-scripts commands, tailored to meet specific platform needs and recent security updates.
The `gui.bat` and `gui.sh` scripts now accept the `--do_not_use_shell` argument to prevent shell execution (`shell=True`) during external process handling. By default, the GUI sets `use_shell` to True internally, as required for proper execution of external commands; to enforce disabling shell execution, pass the `--do_not_use_shell` argument.
- **How to Enable Shell Execution via Config File:**
1. In the `config.toml` file, set `use_shell` to `true` to enable shell usage as per GUI startup settings.
**Note:** The `--do_not_use_shell` option will override the `config.toml` settings, setting `use_shell` to False even if it is set to True in the config file.
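The precedence described above can be sketched like this (a minimal illustration; `resolve_use_shell` and the config dict are hypothetical, not the GUI's actual code):

```python
import argparse

def resolve_use_shell(config: dict, argv: list) -> bool:
    """Sketch of how shell usage might be resolved: config.toml can
    enable it, but --do_not_use_shell always wins and forces it off."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--do_not_use_shell", action="store_true")
    args = parser.parse_args(argv)
    use_shell = bool(config.get("use_shell", False))
    if args.do_not_use_shell:
        use_shell = False  # CLI override, per the note above
    return use_shell
```

So `{"use_shell": true}` in `config.toml` enables shell execution, but launching with `--do_not_use_shell` disables it regardless.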
#### Miscellaneous
- Made various other minor improvements and bug fixes to enhance overall functionality and user experience.
- Fixed an issue where existing LoRA network weights were not properly loaded prior to training
### 2024/04/10 (v23.1.5)
- Fix issue with Textual Inversion configuration file selection.
- Upgrade to Gradio 4.19.2 to fix several high-severity security risks associated with earlier versions. This is a major upgrade, moving from 3.x to 4.x; hopefully it will not introduce unforeseen issues.
- Upgrade transformers to 4.38.0 to fix a low severity security issue.
#### Security Improvements
- Add an explicit `--do_not_share` parameter to `kohya_gui.py` to avoid sharing the GUI on platforms like Kaggle.
- Remove `shell=True` from subprocess calls to avoid security issues when using the GUI.
- Limit caption extensions to a fixed set to reduce the risk of finding and replacing text content in unexpected files.
### 2024/04/08 (v23.1.4)
- Relocate config accordion to the top of the GUI.
### 2024/04/08 (v23.1.3)
- Fix dataset preparation bug.
### 2024/04/08 (v23.1.2)
- Added config.toml support for wd14_caption.
### 2024/04/07 (v23.1.1)
- Added support for Huber loss under the Parameters / Advanced tab.
### 2024/04/07 (v23.1.0)
- Update sd-scripts to 0.8.7
- The default value of `huber_schedule` in Scheduled Huber Loss is changed from `exponential` to `snr`, which is expected to give better results.
- Highlights
- The dependent libraries are updated. Please see [Upgrade](#upgrade) and update the libraries.
- Especially `imagesize` is newly added, so if you cannot update the libraries immediately, please install with `pip install imagesize==1.4.1` separately.
- `bitsandbytes==0.43.0`, `prodigyopt==1.0`, `lion-pytorch==0.0.6` are included in the requirements.txt.
- `bitsandbytes` no longer requires complex procedures as it now officially supports Windows.
- Also, the PyTorch version is updated to 2.1.2 (PyTorch does not need to be updated immediately). In the upgrade procedure, PyTorch is not updated, so please manually install or update torch, torchvision, xformers if necessary (see [Upgrade PyTorch](#upgrade-pytorch)).
- When logging to wandb is enabled, the entire command line is exposed. Therefore, it is recommended to write wandb API key and HuggingFace token in the configuration file (`.toml`). Thanks to bghira for raising the issue.
- A warning is displayed at the start of training if such information is included in the command line.
- Also, if there is an absolute path, the path may be exposed, so it is recommended to specify a relative path or write it in the configuration file. In such cases, an INFO log is displayed.
- See [#1123](https://github.com/kohya-ss/sd-scripts/pull/1123) and PR [#1240](https://github.com/kohya-ss/sd-scripts/pull/1240) for details.
- Colab seems to stop with log output. Try specifying `--console_log_simple` option in the training script to disable rich logging.
- Other improvements include the addition of masked loss, scheduled Huber Loss, DeepSpeed support, dataset settings improvements, and image tagging improvements. See below for details.
- Training scripts
- `train_network.py` and `sdxl_train_network.py` are modified to record some dataset settings in the metadata of the trained model (`caption_prefix`, `caption_suffix`, `keep_tokens_separator`, `secondary_separator`, `enable_wildcard`).
- Fixed a bug that U-Net and Text Encoders are included in the state in `train_network.py` and `sdxl_train_network.py`. The saving and loading of the state are faster, the file size is smaller, and the memory usage when loading is reduced.
- DeepSpeed is supported. PR [#1101](https://github.com/kohya-ss/sd-scripts/pull/1101) and [#1139](https://github.com/kohya-ss/sd-scripts/pull/1139) Thanks to BootsofLagrangian! See PR [#1101](https://github.com/kohya-ss/sd-scripts/pull/1101) for details.
- The masked loss is supported in each training script. PR [#1207](https://github.com/kohya-ss/sd-scripts/pull/1207) See [Masked loss](#masked-loss) for details.
- Scheduled Huber Loss has been introduced to each training script. PR [#1228](https://github.com/kohya-ss/sd-scripts/pull/1228/) Thanks to kabachuha for the PR and cheald, drhead, and others for the discussion! See the PR and [Scheduled Huber Loss](./docs/train_lllite_README.md#scheduled-huber-loss) for details.
- The options `--noise_offset_random_strength` and `--ip_noise_gamma_random_strength` are added to each training script. These options can be used to vary the noise offset and ip noise gamma in the range of 0 to the specified value. PR [#1177](https://github.com/kohya-ss/sd-scripts/pull/1177) Thanks to KohakuBlueleaf!
- The options `--save_state_on_train_end` are added to each training script. PR [#1168](https://github.com/kohya-ss/sd-scripts/pull/1168) Thanks to gesen2egee!
- The options `--sample_every_n_epochs` and `--sample_every_n_steps` in each training script now display a warning and are ignored when a value less than or equal to `0` is specified. Thanks to S-Del for raising the issue.
- Dataset settings
- The [English version of the dataset settings documentation](./docs/config_README-en.md) is added. PR [#1175](https://github.com/kohya-ss/sd-scripts/pull/1175) Thanks to darkstorm2150!
- The `.toml` file for the dataset config is now read in UTF-8 encoding. PR [#1167](https://github.com/kohya-ss/sd-scripts/pull/1167) Thanks to Horizon1704!
- Fixed a bug that the last subset settings are applied to all images when multiple subsets of regularization images are specified in the dataset settings. The settings for each subset are correctly applied to each image. PR [#1205](https://github.com/kohya-ss/sd-scripts/pull/1205) Thanks to feffy380!
- Some features are added to the dataset subset settings.
- `secondary_separator` is added to specify the tag separator that is not the target of shuffling or dropping.
- Specify `secondary_separator=";;;"`. When you specify `secondary_separator`, the part is not shuffled or dropped.
- `enable_wildcard` is added. When set to `true`, the wildcard notation `{aaa|bbb|ccc}` can be used. The multi-line caption is also enabled.
- `keep_tokens_separator` is updated to be used twice in the caption. When you specify `keep_tokens_separator="|||"`, the part divided by the second `|||` is not shuffled or dropped and remains at the end.
- The existing features `caption_prefix` and `caption_suffix` can be used together. `caption_prefix` and `caption_suffix` are processed first, and then `enable_wildcard`, `keep_tokens_separator`, shuffling and dropping, and `secondary_separator` are processed in order.
- See [Dataset config](./docs/config_README-en.md) for details.
- The dataset with DreamBooth method supports caching image information (size, caption). PR [#1178](https://github.com/kohya-ss/sd-scripts/pull/1178) and [#1206](https://github.com/kohya-ss/sd-scripts/pull/1206) Thanks to KohakuBlueleaf! See [DreamBooth method specific options](./docs/config_README-en.md#dreambooth-specific-options) for details.
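The processing order for the subset options above can be illustrated with a toy sketch (this is not sd-scripts' implementation; `process_caption` and its exact replacement rules are assumptions for illustration only):

```python
import random
import re

def process_caption(caption, shuffle=True, secondary_separator=";;;",
                    enable_wildcard=True, seed=None):
    """Toy model of the subset options: wildcards {a|b|c} are resolved
    first, then tags are shuffled, while tags joined by the secondary
    separator travel through shuffling as a single unit and are only
    expanded back into comma-separated tags at the end.
    """
    rng = random.Random(seed)
    if enable_wildcard:
        # pick one alternative from each {aaa|bbb|ccc} group
        caption = re.sub(r"\{([^{}]+)\}",
                         lambda m: rng.choice(m.group(1).split("|")),
                         caption)
    tags = [t.strip() for t in caption.split(",")]
    if shuffle:
        rng.shuffle(tags)
    # expand the protected groups only after shuffling
    return ", ".join(t.replace(secondary_separator, ", ") for t in tags)
```

With shuffling disabled, `process_caption("{smile|smile}, red hair;;;long hair", shuffle=False)` yields `"smile, red hair, long hair"`; with shuffling enabled, the tags joined by `;;;` always stay adjacent.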
- Image tagging (not implemented yet in the GUI)
- The support for v3 repositories is added to `tag_image_by_wd14_tagger.py` (`--onnx` option only). PR [#1192](https://github.com/kohya-ss/sd-scripts/pull/1192) Thanks to sdbds!
- Onnx may need to be updated. Onnx is not installed by default, so please install or update it with `pip install onnx==1.15.0 onnxruntime-gpu==1.17.1` etc. Please also check the comments in `requirements.txt`.
- The model is now saved in a subdirectory named after `--repo_id` in `tag_image_by_wd14_tagger.py`. This caches multiple repo_id models. Please delete unnecessary files under `--model_dir`.
- Some options are added to `tag_image_by_wd14_tagger.py`.
- Some are added in PR [#1216](https://github.com/kohya-ss/sd-scripts/pull/1216) Thanks to Disty0!
- Output rating tags `--use_rating_tags` and `--use_rating_tags_as_last_tag`
- Output character tags first `--character_tags_first`
- Expand character tags and series `--character_tag_expand`
- Specify tags to output first `--always_first_tags`
- Replace tags `--tag_replacement`
- See [Tagging documentation](./docs/wd14_tagger_README-en.md) for details.
- Fixed an error when specifying `--beam_search` and a value of 2 or more for `--num_beams` in `make_captions.py`.
- About Masked loss
The masked loss is supported in each training script. To enable the masked loss, specify the `--masked_loss` option.
The feature is not fully tested, so there may be bugs. If you find any issues, please open an Issue.
ControlNet dataset is used to specify the mask. The mask images should be RGB images. A pixel value of 255 in the R channel is treated as the mask (the loss is calculated only for the masked pixels), and 0 is treated as the non-mask. Pixel values from 0 to 255 are converted to 0 to 1 (i.e., a pixel value of 128 is treated as half the loss weight). See the [LLLite documentation](./docs/train_lllite_README.md#preparing-the-dataset) for details of the dataset specification.
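The R-channel-to-weight conversion can be sketched as follows (a minimal NumPy illustration of the mapping described above, not sd-scripts' code):

```python
import numpy as np

def mask_to_loss_weights(mask_rgb):
    """Map an RGB mask image (H, W, 3, uint8) to per-pixel loss weights
    in [0, 1]: only the R channel is read, 255 -> full loss weight,
    0 -> no loss, and intermediate values scale the loss linearly
    (128 comes out to roughly half weight).
    """
    return np.asarray(mask_rgb)[..., 0].astype(np.float32) / 255.0
```

A per-pixel loss can then be weighted with these values before reduction, e.g. something like `(weights * pixel_loss).sum() / weights.sum()`.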
- About Scheduled Huber Loss
Scheduled Huber Loss has been introduced to each training script. This is a method to improve robustness against outliers or anomalies (data corruption) in the training data.
With the traditional MSE (L2) loss function, the impact of outliers could be significant, potentially leading to a degradation in the quality of generated images. On the other hand, while the Huber loss function can suppress the influence of outliers, it tends to compromise the reproduction of fine details in images.
To address this, the proposed method employs a clever application of the Huber loss function. By scheduling the use of Huber loss in the early stages of training (when noise is high) and MSE in the later stages, it strikes a balance between outlier robustness and fine detail reproduction.
Experimental results have confirmed that this method achieves higher accuracy on data containing outliers compared to pure Huber loss or MSE. The increase in computational cost is minimal.
The newly added arguments `loss_type`, `huber_schedule`, and `huber_c` allow for the selection of the loss function type (Huber, smooth L1, MSE), the scheduling method (exponential, constant, SNR), and Huber's parameter. This enables optimization based on the characteristics of the dataset.
See PR [#1228](https://github.com/kohya-ss/sd-scripts/pull/1228/) for details.
- `loss_type`: Specify the loss function type. Choose `huber` for Huber loss, `smooth_l1` for smooth L1 loss, and `l2` for MSE loss. The default is `l2`, which is the same as before.
- `huber_schedule`: Specify the scheduling method. Choose `exponential`, `constant`, or `snr`. The default is `snr`.
- `huber_c`: Specify the Huber's parameter. The default is `0.1`.
Please read [Releases](https://github.com/kohya-ss/sd-scripts/releases) for recent updates.
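The trade-off described above comes from the shape of the loss itself, which a pseudo-Huber sketch makes visible (the exact expressions and scaling used by sd-scripts live in PR #1228; `robust_loss` below illustrates the general idea, not the implementation):

```python
import numpy as np

def robust_loss(pred, target, loss_type="l2", huber_c=0.1):
    """Per-element loss. The pseudo-Huber branch is quadratic for
    residuals much smaller than huber_c (fine detail behaves like MSE)
    and grows only linearly for large residuals (outliers are damped).
    """
    d = np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    if loss_type == "l2":
        return d ** 2
    if loss_type == "huber":
        return 2 * huber_c * (np.sqrt(d ** 2 + huber_c ** 2) - huber_c)
    raise ValueError(f"unknown loss_type: {loss_type}")
```

Scheduling then just varies which behavior is in effect as a function of the noise level: Huber-like where noise is high, MSE-like where it is low.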
- Added GUI support for the new parameters listed above.
- Moved accelerate launch parameters to a new `Accelerate launch` accordion above the `Model` accordion.
- Added support for `Debiased Estimation loss` to Dreambooth settings.
- Added support for "Dataset Preparation" defaults via the config.toml file.
- Added a field to allow for the input of extra accelerate launch arguments.
- Added new caption tool from <https://github.com/kainatquaderee>
- To ensure cross-platform compatibility and security, the GUI now defaults to using `shell=False` when running subprocesses. This is based on documentation and should not cause issues on most platforms. However, some users have reported issues on specific platforms such as runpod and colab. Please open an issue if you encounter any issues.


@@ -44,16 +44,18 @@ def caption_images(
     Returns:
         None
     """
-    # Check if images_dir is provided
+    # Check if images_dir and caption_ext are provided
+    missing_parameters = []
     if not images_dir:
-        log.info(
-            "Image folder is missing. Please provide the directory containing the images to caption."
-        )
-        return
-    # Check if caption_ext is provided
+        missing_parameters.append("image directory")
     if not caption_ext:
-        log.info("Please provide an extension for the caption files.")
+        missing_parameters.append("caption file extension")
+    if missing_parameters:
+        log.info(
+            "The following parameter(s) are missing: {}. "
+            "Please provide these to proceed with captioning the images.".format(", ".join(missing_parameters))
+        )
+        return
     # Log the captioning process
@@ -93,22 +95,23 @@ def caption_images(
     # Check if overwrite option is enabled
     if overwrite:
-        # Add prefix and postfix to caption files
-        if prefix or postfix:
+        # Add prefix and postfix to caption files or find and replace text in caption files
+        if prefix or postfix or find_text:
             # Add prefix and/or postfix to caption files
             add_pre_postfix(
                 folder=images_dir,
                 caption_file_ext=caption_ext,
                 prefix=prefix,
                 postfix=postfix,
             )
+            # Find and replace text in caption files
+            if find_text:
+                find_replace(
+                    folder_path=images_dir,
+                    caption_file_ext=caption_ext,
+                    search_text=find_text,
+                    replace_text=replace_text,
+                )
-        # Replace specified text in caption files if find and replace text is provided
-        if find_text and replace_text:
-            find_replace(
-                folder_path=images_dir,
-                caption_file_ext=caption_ext,
-                search_text=find_text,
-                replace_text=replace_text,
-            )
     else:
         # Show a message if modification is not possible without overwrite option enabled
         if prefix or postfix:


@@ -751,6 +751,7 @@ def add_pre_postfix(
         prefix (str, optional): Prefix to add to the content of the caption files.
         postfix (str, optional): Postfix to add to the content of the caption files.
         caption_file_ext (str, optional): Extension of the caption files.
+        recursive (bool, optional): Whether to search for caption files recursively.
     """
     # If neither prefix nor postfix is provided, return early
     if prefix == "" and postfix == "":
@@ -775,33 +776,39 @@ def add_pre_postfix(
     # Iterate over the list of image files
     for image_file in image_files:
         # Construct the caption file name by appending the caption file extension to the image file name
-        caption_file_name = os.path.splitext(image_file)[0] + caption_file_ext
+        caption_file_name = f"{os.path.splitext(image_file)[0]}{caption_file_ext}"
         # Construct the full path to the caption file
         caption_file_path = os.path.join(folder, caption_file_name)
         # Check if the caption file does not exist
         if not os.path.exists(caption_file_path):
             # Create a new caption file with the specified prefix and/or postfix
-            with open(caption_file_path, "w", encoding="utf-8") as f:
-                # Determine the separator based on whether both prefix and postfix are provided
-                separator = " " if prefix and postfix else ""
-                f.write(f"{prefix}{separator}{postfix}")
+            try:
+                with open(caption_file_path, "w", encoding="utf-8") as f:
+                    # Determine the separator based on whether both prefix and postfix are provided
+                    separator = " " if prefix and postfix else ""
+                    f.write(f"{prefix}{separator}{postfix}")
+            except Exception as e:
+                log.error(f"Error writing to file {caption_file_path}: {e}")
         else:
             # Open the existing caption file for reading and writing
-            with open(caption_file_path, "r+", encoding="utf-8") as f:
-                # Read the content of the caption file, stripping any trailing whitespace
-                content = f.read().rstrip()
-                # Move the file pointer to the beginning of the file
-                f.seek(0, 0)
-                # Determine the separator based on whether only prefix is provided
-                prefix_separator = " " if prefix else ""
-                # Determine the separator based on whether only postfix is provided
-                postfix_separator = " " if postfix else ""
-                # Write the updated content to the caption file, adding prefix and/or postfix
-                f.write(
-                    f"{prefix}{prefix_separator}{content}{postfix_separator}{postfix}"
-                )
+            try:
+                with open(caption_file_path, "r+", encoding="utf-8") as f:
+                    # Read the content of the caption file, stripping any trailing whitespace
+                    content = f.read().rstrip()
+                    # Move the file pointer to the beginning of the file
+                    f.seek(0, 0)
+                    # Determine the separator based on whether only prefix is provided
+                    prefix_separator = " " if prefix else ""
+                    # Determine the separator based on whether only postfix is provided
+                    postfix_separator = " " if postfix else ""
+                    # Write the updated content to the caption file, adding prefix and/or postfix
+                    f.write(
+                        f"{prefix}{prefix_separator}{content}{postfix_separator}{postfix}"
+                    )
+            except Exception as e:
+                log.error(f"Error writing to file {caption_file_path}: {e}")
def has_ext_files(folder_path: str, file_extension: str) -> bool:


@@ -80,15 +80,42 @@ def convert_lcm(
def gradio_convert_lcm_tab(headless=False):
    """
    Creates a Gradio tab for converting a model to an LCM model.

    Args:
        headless (bool): If True, the tab will be created without any visible elements.

    Returns:
        None
    """
    current_model_dir = os.path.join(scriptdir, "outputs")
    current_save_dir = os.path.join(scriptdir, "outputs")

    def list_models(path):
        """
        Lists all model files in the given directory.

        Args:
            path (str): The directory path to search for model files.

        Returns:
            list: A list of model file paths.
        """
        nonlocal current_model_dir
        current_model_dir = path
        return list(list_files(path, exts=[".safetensors"], all=True))

    def list_save_to(path):
        """
        Lists all save-to options for the given directory.

        Args:
            path (str): The directory path to search for save-to options.

        Returns:
            list: A list of save-to options.
        """
        nonlocal current_save_dir
        current_save_dir = path
        return list(list_files(path, exts=[".safetensors"], all=True))

test.md (new file, 452 lines)

@@ -0,0 +1,452 @@
![](portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png)
# Wikipedia **The Free Encyclopedia**
[ **English** 6,796,000+ articles ](//en.wikipedia.org/ "English — Wikipedia
— The Free Encyclopedia")
[ **Español** 1.938.000+ artículos ](//es.wikipedia.org/ "Español —
Wikipedia — La enciclopedia libre")
[ **Русский** 1 969 000+ статей ](//ru.wikipedia.org/ "Russkiy
— Википедия — Свободная энциклопедия")
[ **日本語** 1,407,000+ 記事 ](//ja.wikipedia.org/ "Nihongo —
ウィキペディア — フリー百科事典")
[ **Deutsch** 2.891.000+ Artikel ](//de.wikipedia.org/ "Deutsch — Wikipedia
— Die freie Enzyklopädie")
[ **Français** 2 598 000+ articles ](//fr.wikipedia.org/ "français — Wikipédia — L’encyclopédie libre")
[ **Italiano** 1.853.000+ voci ](//it.wikipedia.org/ "Italiano — Wikipedia
— L'enciclopedia libera")
[ **中文** 1,409,000+ 条目 / 條目 ](//zh.wikipedia.org/ "Zhōngwén —
维基百科 / 維基百科 — 自由的百科全书 /
自由的百科全書")
[ **فارسی** ۹۹۵٬۰۰۰+ مقاله ](//fa.wikipedia.org/ "Fārsi — ویکی‌پدیا — دانشنامهٔ آزاد")
[ **Português** 1.120.000+ artigos ](//pt.wikipedia.org/ "Português —
Wikipédia — A enciclopédia livre")
**Read Wikipedia in your language**
## 1,000,000+ articles
* [Polski](//pl.wikipedia.org/)
* [العربية](//ar.wikipedia.org/ "Al-ʿArabīyah")
* [Deutsch](//de.wikipedia.org/)
* [English](//en.wikipedia.org/ "English")
* [Español](//es.wikipedia.org/)
* [Français](//fr.wikipedia.org/ "français")
* [Italiano](//it.wikipedia.org/)
* [مصرى](//arz.wikipedia.org/ "Maṣrī")
* [Nederlands](//nl.wikipedia.org/)
* [日本語](//ja.wikipedia.org/ "Nihongo")
* [Português](//pt.wikipedia.org/)
* [Русский](//ru.wikipedia.org/ "Russkiy")
* [Sinugboanong Binisaya](//ceb.wikipedia.org/)
* [Svenska](//sv.wikipedia.org/)
* [Українська](//uk.wikipedia.org/ "Ukrayins’ka")
* [Tiếng Việt](//vi.wikipedia.org/)
* [Winaray](//war.wikipedia.org/)
* [中文](//zh.wikipedia.org/ "Zhōngwén")
## 100,000+ articles
* [Afrikaans](//af.wikipedia.org/ "Afrikaans")
* [Asturianu](//ast.wikipedia.org/)
* [Azərbaycanca](//az.wikipedia.org/)
* [Български](//bg.wikipedia.org/ "Bǎlgarski")
* [閩南語 / Bân-lâm-gú](//zh-min-nan.wikipedia.org/ "Bân-lâm-gú")
* [বাংলা](//bn.wikipedia.org/ "Bangla")
* [Беларуская](//be.wikipedia.org/ "Belaruskaya")
* [Català](//ca.wikipedia.org/)
* [Čeština](//cs.wikipedia.org/ "čeština")
* [Cymraeg](//cy.wikipedia.org/ "Cymraeg")
* [Dansk](//da.wikipedia.org/)
* [Eesti](//et.wikipedia.org/)
* [Ελληνικά](//el.wikipedia.org/ "Ellīniká")
* [Esperanto](//eo.wikipedia.org/)
* [Euskara](//eu.wikipedia.org/)
* [فارسی](//fa.wikipedia.org/ "Fārsi")
* [Galego](//gl.wikipedia.org/)
* [한국어](//ko.wikipedia.org/ "Hangugeo")
* [Հայերեն](//hy.wikipedia.org/ "Hayeren")
* [हिन्दी](//hi.wikipedia.org/ "Hindī")
* [Hrvatski](//hr.wikipedia.org/)
* [Bahasa Indonesia](//id.wikipedia.org/)
* [עברית](//he.wikipedia.org/ "Ivrit")
* [ქართული](//ka.wikipedia.org/ "Kartuli")
* [Ladin](//lld.wikipedia.org/)
* [Latina](//la.wikipedia.org/)
* [Latviešu](//lv.wikipedia.org/)
* [Lietuvių](//lt.wikipedia.org/)
* [Magyar](//hu.wikipedia.org/)
* [Македонски](//mk.wikipedia.org/ "Makedonski")
* [Bahasa Melayu](//ms.wikipedia.org/)
* [Bahaso Minangkabau](//min.wikipedia.org/)
* [မြန်မာဘာသာ](//my.wikipedia.org/ "Myanmarsar")
* Norsk
* [bokmål](//no.wikipedia.org/)
* [nynorsk](//nn.wikipedia.org/)
* [ÐÐ¾Ñ Ñ‡Ð¸Ð¹Ð½](//ce.wikipedia.org/ "Noxçiyn")
* [Oʻzbekcha / Ўзбекча](//uz.wikipedia.org/ "Oʻzbekcha")
* [Қазақша / Qazaqşa / قازاقشا](//kk.wikipedia.org/)
* [Română](//ro.wikipedia.org/ "Română")
* [Simple English](//simple.wikipedia.org/)
* [Slovenčina](//sk.wikipedia.org/)
* [Slovenščina](//sl.wikipedia.org/ "slovenščina")
* [Српски / Srpski](//sr.wikipedia.org/)
* [Srpskohrvatski / Српскохрватски](//sh.wikipedia.org/)
* [Suomi](//fi.wikipedia.org/ "suomi")
* [தமிழ்](//ta.wikipedia.org/ "Tamiḻ")
* [Татарча / Tatarça](//tt.wikipedia.org/)
* [ภาษาไทย](//th.wikipedia.org/ "Phasa Thai")
* [Тоҷикӣ](//tg.wikipedia.org/ "Tojikī")
* [تۆرکجه](//azb.wikipedia.org/ "Türkce")
* [Türkçe](//tr.wikipedia.org/ "Türkçe")
* [اردو](//ur.wikipedia.org/ "Urdu")
* [粵語](//zh-yue.wikipedia.org/)
## 10,000+ articles
* [Bahsa Acèh](//ace.wikipedia.org/)
* [Alemannisch](//als.wikipedia.org/)
* [አማርኛ](//am.wikipedia.org/ "Āmariññā")
* [Aragonés](//an.wikipedia.org/)
* [Արեւմտահայերէն](//hyw.wikipedia.org/ "Arevmdahayeren")
* [Bahasa Hulontalo](//gor.wikipedia.org/)
* [Basa Bali](//ban.wikipedia.org/ "Basa Bali")
* [Bahasa Banjar](//bjn.wikipedia.org/)
* [Basa Banyumasan](//map-bms.wikipedia.org/)
* [Башҡортса](//ba.wikipedia.org/ "Başqortsa")
* [Беларуская (тарашкевіца)](//be-tarask.wikipedia.org/ "Bielaruskaja \(taraškievica\)")
* [Bikol Central](//bcl.wikipedia.org/)
* [বিষ্ণুপ্রিয়া মণিপুরী](//bpy.wikipedia.org/ "Bishnupriya Manipuri")
* [Boarisch](//bar.wikipedia.org/)
* [Bosanski](//bs.wikipedia.org/)
* [Brezhoneg](//br.wikipedia.org/)
* [Чӑвашла](//cv.wikipedia.org/ "Čăvašla")
* [Diné Bizaad](//nv.wikipedia.org/)
* [Emiliàn–Rumagnòl](//eml.wikipedia.org/)
* [Fiji Hindi](//hif.wikipedia.org/)
* [Føroyskt](//fo.wikipedia.org/)
* [Frysk](//fy.wikipedia.org/)
* [Gaeilge](//ga.wikipedia.org/)
* [Gàidhlig](//gd.wikipedia.org/)
* [ગુજરાતી](//gu.wikipedia.org/ "Gujarati")
* [Hak-kâ-ngî / 客家語](//hak.wikipedia.org/)
* [Hausa](//ha.wikipedia.org/ "Hausa")
* [Hornjoserbsce](//hsb.wikipedia.org/)
* [Ido](//io.wikipedia.org/ "Ido")
* [Igbo](//ig.wikipedia.org/)
* [Ilokano](//ilo.wikipedia.org/)
* [Interlingua](//ia.wikipedia.org/)
* [Interlingue](//ie.wikipedia.org/)
* [Ирон](//os.wikipedia.org/ "Iron")
* [Íslenska](//is.wikipedia.org/)
* [Jawa](//jv.wikipedia.org/ "Jawa")
* [ಕನ್ನಡ](//kn.wikipedia.org/ "Kannada")
* [ភាសាខ្មែរ](//km.wikipedia.org/ "Phéasa Khmér")
* [Kotava](//avk.wikipedia.org/)
* [Kreyòl Ayisyen](//ht.wikipedia.org/)
* [Kurdî / كوردی](//ku.wikipedia.org/)
* [کوردیی ناوەندی](//ckb.wikipedia.org/ "Kurdîy Nawendî")
* [Кыргызча](//ky.wikipedia.org/ "Kyrgyzča")
* [Кырык мары](//mrj.wikipedia.org/ "Kyryk Mary")
* [Lëtzebuergesch](//lb.wikipedia.org/)
* [Lìgure](//lij.wikipedia.org/)
* [Limburgs](//li.wikipedia.org/)
* [Lombard](//lmo.wikipedia.org/)
* [मैथिली](//mai.wikipedia.org/ "Maithilī")
* [Malagasy](//mg.wikipedia.org/)
* [മലയാളം](//ml.wikipedia.org/ "Malayalam")
* [मराठी](//mr.wikipedia.org/ "Marathi")
* [მარგალური](//xmf.wikipedia.org/ "Margaluri")
* [مازِرونی](//mzn.wikipedia.org/ "Mäzeruni")
* [Mìng-dĕ̤ng-ngṳ̄ / 閩東語](//cdo.wikipedia.org/ "Ming-deng-ngu")
* [Монгол](//mn.wikipedia.org/ "Mongol")
* [Napulitano](//nap.wikipedia.org/)
* [नेपाल भाषा](//new.wikipedia.org/ "Nepal Bhasa")
* [नेपाली](//ne.wikipedia.org/ "Nepālī")
* [Nordfriisk](//frr.wikipedia.org/)
* [Occitan](//oc.wikipedia.org/)
* [Олык марий](//mhr.wikipedia.org/ "Olyk Marij")
* [ଓଡି଼ଆ](//or.wikipedia.org/ "Oṛ")
* [অসমীয়া](//as.wikipedia.org/ "Ôxômiya")
* [ਪੰਜਾਬੀ](//pa.wikipedia.org/ "Pañjābī \(Gurmukhī\)")
* [پنجابی (شاہ مکھی)](//pnb.wikipedia.org/ "Pañjābī \(Shāhmukhī\)")
* [پښتو](//ps.wikipedia.org/ "Paʂto")
* [Piemontèis](//pms.wikipedia.org/)
* [Plattdüütsch](//nds.wikipedia.org/)
* [Qırımtatarca](//crh.wikipedia.org/)
* [Runa Simi](//qu.wikipedia.org/)
* [संस्कृतम्](//sa.wikipedia.org/ "Saṃskṛtam")
* [ᱥᱟᱱᱛᱟᱲᱤ](//sat.wikipedia.org/ "Santali")
* [Саха Тыла](//sah.wikipedia.org/ "Saxa Tyla")
* [Scots](//sco.wikipedia.org/)
* [ChiShona](//sn.wikipedia.org/)
* [Shqip](//sq.wikipedia.org/)
* [Sicilianu](//scn.wikipedia.org/)
* [සිංහල](//si.wikipedia.org/ "Siṃhala")
* [سنڌي](//sd.wikipedia.org/ "Sindhī")
* [Ślůnski](//szl.wikipedia.org/)
* [Basa Sunda](//su.wikipedia.org/)
* [Kiswahili](//sw.wikipedia.org/)
* [Tagalog](//tl.wikipedia.org/)
* [ၽႃႇသႃႇတႆး](//shn.wikipedia.org/)
* [తెలుగు](//te.wikipedia.org/ "Telugu")
* [chiTumbuka](//tum.wikipedia.org/)
* [Basa Ugi](//bug.wikipedia.org/)
* [Vèneto](//vec.wikipedia.org/)
* [Volapük](//vo.wikipedia.org/)
* [Walon](//wa.wikipedia.org/)
* [文言](//zh-classical.wikipedia.org/ "Wényán")
* [吴语](//wuu.wikipedia.org/ "Wúyǔ")
* [ייִדיש](//yi.wikipedia.org/ "Yidiš")
* [Yorùbá](//yo.wikipedia.org/)
* [Zazaki](//diq.wikipedia.org/ "Zazaki")
* [žemaitėška](//bat-smg.wikipedia.org/)
* [isiZulu](//zu.wikipedia.org/)
* [ꯃꯤꯇꯩ ꯂꯣꯟ](//mni.wikipedia.org/)
## 1,000+ articles
* [Dzhudezmo / ×œ××“×™× ×•](//lad.wikipedia.org/)
* [Адыгэбзэ](//kbd.wikipedia.org/ "Adighabze")
* [Ænglisc](//ang.wikipedia.org/)
* [Anarâškielâ](//smn.wikipedia.org/ "anarâškielâ")
* [ठंगिका](//anp.wikipedia.org/ "Angika")
* [Аԥсшәа](//ab.wikipedia.org/ "aṗsshwa")
* [armãneashti](//roa-rup.wikipedia.org/)
* [Arpitan](//frp.wikipedia.org/)
* [atikamekw](//atj.wikipedia.org/)
* [ܐܬܘܪܝܐ](//arc.wikipedia.org/ "Ātûrāyâ")
* [Avañe’ẽ](//gn.wikipedia.org/)
* [Авар](//av.wikipedia.org/ "Avar")
* [Aymar](//ay.wikipedia.org/)
* [भोजपुरी](//bh.wikipedia.org/ "Bhōjapurī")
* [Bislama](//bi.wikipedia.org/)
* [བོད་ཡིག](//bo.wikipedia.org/ "Bod Skad")
* [Буряад](//bxr.wikipedia.org/ "Buryad")
* [Chavacano de Zamboanga](//cbk-zam.wikipedia.org/)
* [Chichewa](//ny.wikipedia.org/)
* [Corsu](//co.wikipedia.org/)
* [Vahcuengh / 話僮](//za.wikipedia.org/)
* [Dagaare](//dga.wikipedia.org/)
* [Dagbanli](//dag.wikipedia.org/)
* [الدارجة](//ary.wikipedia.org/ "Darija")
* [Davvisámegiella](//se.wikipedia.org/ "davvisámegiella")
* [Deitsch](//pdc.wikipedia.org/)
* [ދިވެހިބަސް](//dv.wikipedia.org/ "Divehi")
* [Dolnoserbski](//dsb.wikipedia.org/)
* [Эрзянь](//myv.wikipedia.org/ "Erzjanj")
* [Estremeñu](//ext.wikipedia.org/)
* [Fulfulde](//ff.wikipedia.org/)
* [Furlan](//fur.wikipedia.org/)
* [Gaelg](//gv.wikipedia.org/)
* [Gagauz](//gag.wikipedia.org/)
* [ГӀалгӀай](//inh.wikipedia.org/ "Ghalghai")
* [Gĩkũyũ](//ki.wikipedia.org/)
* [گیلکی](//glk.wikipedia.org/ "Giləki")
* [赣语 / 贛語](//gan.wikipedia.org/ "Gon ua")
* [Gungbe](//guw.wikipedia.org/)
* [Хальмг](//xal.wikipedia.org/ "Halʹmg")
* [ʻŌlelo Hawaiʻi](//haw.wikipedia.org/)
* [Ikinyarwanda](//rw.wikipedia.org/)
* [Kabɩyɛ](//kbp.wikipedia.org/)
* [Kapampangan](//pam.wikipedia.org/)
* [Kaszëbsczi](//csb.wikipedia.org/)
* [Kernewek](//kw.wikipedia.org/)
* [Коми](//kv.wikipedia.org/ "Komi")
* [Перем коми](//koi.wikipedia.org/ "Perem Komi")
* [Kongo](//kg.wikipedia.org/)
* [कोंकणी / Konknni](//gom.wikipedia.org/)
* [كٲشُر](//ks.wikipedia.org/ "Koshur")
* [Kriyòl Gwiyannen](//gcr.wikipedia.org/ "Kriyòl Gwiyannen")
* [ພາສາລາວ](//lo.wikipedia.org/ "Phaasaa Laao")
* [Лакку](//lbe.wikipedia.org/ "Lakku")
* [Latgaļu](//ltg.wikipedia.org/)
* [Лезги](//lez.wikipedia.org/ "Lezgi")
* [Li Niha](//nia.wikipedia.org/)
* [Lingála](//ln.wikipedia.org/)
* [Lingua Franca Nova](//lfn.wikipedia.org/)
* [livvinkarjala](//olo.wikipedia.org/)
* [lojban](//jbo.wikipedia.org/)
* [Luganda](//lg.wikipedia.org/)
* [Madhurâ](//mad.wikipedia.org/)
* [Malti](//mt.wikipedia.org/)
* [ori](//mi.wikipedia.org/)
* [Twi](//tw.wikipedia.org/ "Mfantse")
* [Mirandés](//mwl.wikipedia.org/)
* [Мокшень](//mdf.wikipedia.org/ "Mokšenj")
* [ဘာသာ မန်](//mnw.wikipedia.org/)
* [ߒߞߏ](//nqo.wikipedia.org/ "N'Ko")
* [Na Vosa Vaka-Viti](//fj.wikipedia.org/)
* [huatlahtōlli](//nah.wikipedia.org/)
* [Naijá](//pcm.wikipedia.org/)
* [Nedersaksisch](//nds-nl.wikipedia.org/)
* [Nouormand / Normaund](//nrm.wikipedia.org/)
* [Novial](//nov.wikipedia.org/)
* [Afaan Oromoo](//om.wikipedia.org/ "Ingiliffaa")
* [ပအိုဝ်ႏဘာႏသာႏ](//blk.wikipedia.org/)
* [पालि](//pi.wikipedia.org/ "Pāḷi")
* [Pangasinán](//pag.wikipedia.org/)
* [Pangcah](//ami.wikipedia.org/)
* [Papiamentu](//pap.wikipedia.org/)
* [Patois](//jam.wikipedia.org/)
* [Pfälzisch](//pfl.wikipedia.org/)
* [Picard](//pcd.wikipedia.org/)
* [Къарачай–малкъар](//krc.wikipedia.org/ "Qaraçay–Malqar")
* [Qaraqalpaqsha](//kaa.wikipedia.org/ "Qaraqalpaqsha")
* [Ripoarisch](//ksh.wikipedia.org/)
* [Rumantsch](//rm.wikipedia.org/)
* [Русиньскый](//rue.wikipedia.org/ "Rusin’skyj")
* [Sakizaya](//szy.wikipedia.org/)
* [Gagana Sāmoa](//sm.wikipedia.org/)
* [سرائیکی](//skr.wikipedia.org/ "Saraiki")
* [Sardu](//sc.wikipedia.org/ "Sardu")
* [Seediq](//trv.wikipedia.org/)
* [Seeltersk](//stq.wikipedia.org/)
* [Sesotho sa Leboa](//nso.wikipedia.org/)
* [Setswana](//tn.wikipedia.org/)
* [Словѣ́ньскъ / ⰔⰎⰑⰂⰡⰐⰠⰔⰍⰟ](//cu.wikipedia.org/ "Slově­skÅ­")
* [Soomaaliga](//so.wikipedia.org/)
* [Sranantongo](//srn.wikipedia.org/)
* [Taclḥit](//shi.wikipedia.org/)
* [Reo tahiti](//ty.wikipedia.org/)
* [ⵜⴰⵎⴰⵣⵉⵖⵜ ⵜⴰⵏⴰⵡⴰⵢⵜ](//zgh.wikipedia.org/ "Tamazight tanawayt")
* [Taqbaylit](//kab.wikipedia.org/ "Taqbaylit")
* [Tarandíne](//roa-tara.wikipedia.org/)
* [Tayal](//tay.wikipedia.org/)
* [Tetun](//tet.wikipedia.org/)
* [Tok Pisin](//tpi.wikipedia.org/)
* [tolışi](//tly.wikipedia.org/)
* [faka Tonga](//to.wikipedia.org/)
* [ᏣᎳᎩ](//chr.wikipedia.org/ "Tsalagi")
* [Türkmençe](//tk.wikipedia.org/)
* [Тыва дыл](//tyv.wikipedia.org/ "Tyva dyl")
* [Удмурт](//udm.wikipedia.org/ "Udmurt")
* [ئۇيغۇرچه](//ug.wikipedia.org/)
* [Vepsän](//vep.wikipedia.org/)
* [võro](//fiu-vro.wikipedia.org/)
* [West-Vlams](//vls.wikipedia.org/)
* [Wolof](//wo.wikipedia.org/)
* [isiXhosa](//xh.wikipedia.org/)
* [Zeêuws](//zea.wikipedia.org/)
* [алтай тил](//alt.wikipedia.org/)
* [ठवधी](//awa.wikipedia.org/)
* [डोटेली](//dty.wikipedia.org/)
* [ತುಳು](//tcy.wikipedia.org/)
## 100+ articles
* [Bamanankan](//bm.wikipedia.org/)
* [Batak Toba](//bbc.wikipedia.org/)
* [Chamoru](//ch.wikipedia.org/)
* [རྫོང་ཁ](//dz.wikipedia.org/ "Rdzong-Kha")
* [‹egbe](//ee.wikipedia.org/)
* [Farefare](//gur.wikipedia.org/)
* [”Ì€ngbè](//fon.wikipedia.org/)
* [Ghanaian Pidgin](//gpe.wikipedia.org/)
* [𐌲𐌿𐍄𐌹𐍃𐌺](//got.wikipedia.org/ "Gutisk")
* [ᐃᓄᒃᑎᑐᑦ / Inuktitut](//iu.wikipedia.org/)
* [Iñupiak](//ik.wikipedia.org/)
* [Kalaallisut](//kl.wikipedia.org/)
* [Mfantse](//fat.wikipedia.org/)
* [Norfuk / Pitkern](//pih.wikipedia.org/)
* [pinayuanan](//pwn.wikipedia.org/)
* [Ποντιακά](//pnt.wikipedia.org/ "Pontiaká")
* [romani čhib](//rmy.wikipedia.org/)
* [Ikirundi](//rn.wikipedia.org/)
* [Sängö](//sg.wikipedia.org/)
* [Sesotho](//st.wikipedia.org/)
* [SiSwati](//ss.wikipedia.org/)
* [ትግርኛ](//ti.wikipedia.org/ "Tə™™Ã±a")
* [Thuɔŋjäŋ](//din.wikipedia.org/)
* [Tsėhesenėstsestotse](//chy.wikipedia.org/)
* [Xitsonga](//ts.wikipedia.org/)
* [Tyap](//kcg.wikipedia.org/)
* [Tshivenḓa](//ve.wikipedia.org/)
* [Wayuunaiki](//guc.wikipedia.org/)
* [адыгабзэ](//ady.wikipedia.org/)
[Other
languages](https://meta.wikimedia.org/wiki/Special:MyLanguage/List_of_Wikipedias)
* * *
Wikipedia is hosted by the Wikimedia Foundation, a non-profit organization
that also hosts a range of other projects.
[ You can support our work with a donation.
](https://donate.wikimedia.org/?utm_medium=portal&utm_campaign=portalFooter&utm_source=portalFooter)
**[ Download Wikipedia for Android or iOS
](https://en.wikipedia.org/wiki/List_of_Wikipedia_mobile_applications) **
Save your favorite articles to read offline, sync your reading lists across
devices and customize your reading experience with the official Wikipedia app.
* [ Google Play Store ](https://play.google.com/store/apps/details?id=org.wikipedia&referrer=utm_source%3Dportal%26utm_medium%3Dbutton%26anid%3Dadmob)
* [ Apple App Store ](https://itunes.apple.com/app/apple-store/id324715238?pt=208305&ct=portal&mt=8)
[ Commons Freely usable photos & more ](//commons.wikimedia.org/)
[ Wikivoyage Free travel guide ](//www.wikivoyage.org/)
[ Wiktionary Free dictionary ](//www.wiktionary.org/)
[ Wikibooks Free textbooks ](//www.wikibooks.org/)
[ Wikinews Free news source ](//www.wikinews.org/)
[ Wikidata Free knowledge base ](//www.wikidata.org/)
[ Wikiversity Free course materials ](//www.wikiversity.org/)
[ Wikiquote Free quote compendium ](//www.wikiquote.org/)
[ MediaWiki Free & open wiki application ](//www.mediawiki.org/)
[ Wikisource Free library ](//www.wikisource.org/)
[ Wikispecies Free species directory ](//species.wikimedia.org/)
[ Wikifunctions Free function library ](//www.wikifunctions.org/)
[ Meta-Wiki Community coordination & documentation ](//meta.wikimedia.org/)
* * *
This page is available under the [Creative Commons Attribution-ShareAlike
License](https://creativecommons.org/licenses/by-sa/4.0/) [Terms of
Use](https://meta.wikimedia.org/wiki/Terms_of_use) [Privacy
Policy](https://meta.wikimedia.org/wiki/Privacy_policy)

View File

@ -3,67 +3,58 @@
# eg: python caption.py D:\some\folder\location "*.png, *.jpg, *.webp" "some caption text"
import argparse
import logging
import os
from pathlib import Path


def create_caption_files(image_folder: Path, file_pattern: str, caption_text: str, caption_file_ext: str, overwrite: bool):
    # Split the file patterns string and strip whitespace from each pattern
    patterns = [pattern.strip() for pattern in file_pattern.split(",")]

    # Iterate over the file patterns
    for pattern in patterns:
        # Use the glob method to match the file pattern
        files = image_folder.glob(pattern)

        # Iterate over the matched files
        for file in files:
            # Check if a caption file with the same name as the current file already exists
            txt_file = file.with_suffix(caption_file_ext)
            if not txt_file.exists() or overwrite:
                # Create the caption file if it does not exist, or overwrite it when requested
                txt_file.write_text(caption_text)
                logging.info(f"Caption file created: {txt_file}")


def writable_dir(target_path):
    """Check that a path is an existing directory that can be written to."""
    path = Path(target_path)
    if path.is_dir():
        if os.access(path, os.W_OK):
            return path
        raise argparse.ArgumentTypeError(f"Directory '{path}' is not writable.")
    raise argparse.ArgumentTypeError(f"Directory '{path}' does not exist.")


def main():
    # Set up logging
    logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')

    # Define command-line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument("image_folder", type=writable_dir, help="The folder where the image files are located")
    parser.add_argument("--file_pattern", type=str, default="*.png, *.jpg, *.jpeg, *.webp", help="the pattern to match the image file names")
    parser.add_argument("--caption_file_ext", type=str, default=".caption", help="the caption file extension")
    parser.add_argument("--overwrite", action="store_true", default=False, help="whether to overwrite existing caption files")

    # Exactly one source of caption text must be provided
    caption_group = parser.add_mutually_exclusive_group(required=True)
    caption_group.add_argument("--caption_text", type=str, help="the text to include in the caption files")
    caption_group.add_argument("--caption_file", type=argparse.FileType("r"), help="the file containing the text to include in the caption files")

    # Parse the command-line arguments
    args = parser.parse_args()

    # Get the caption text from either the caption_text or caption_file argument
    caption_text = args.caption_text if args.caption_text is not None else args.caption_file.read()

    # Create the caption files
    create_caption_files(args.image_folder, args.file_pattern, caption_text, args.caption_file_ext, args.overwrite)


if __name__ == "__main__":
    main()
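The glob-and-write loop at the heart of `create_caption_files` is easy to exercise on its own. A minimal stdlib-only sketch, using a temporary directory and made-up filenames:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Empty files standing in for real training images
    (folder / "cat_01.png").touch()
    (folder / "cat_02.jpg").touch()

    caption_text = "a photo of a cat"
    for pattern in ["*.png", "*.jpg"]:
        for file in folder.glob(pattern):
            # Same naming rule as the script: swap the extension for .caption
            txt_file = file.with_suffix(".caption")
            if not txt_file.exists():
                txt_file.write_text(caption_text)

    created = sorted(p.name for p in folder.glob("*.caption"))
    print(created)  # ['cat_01.caption', 'cat_02.caption']
```

Running with `--overwrite` corresponds to dropping the `txt_file.exists()` check.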

View File

@ -1,50 +1,99 @@
# Proposed by https://github.com/kainatquaderee
import argparse
import logging
from pathlib import Path


def is_image_file(filename, image_extensions):
    """Check if a file is an image file based on its extension."""
    return Path(filename).suffix.lower() in image_extensions


def create_text_file(image_filename, output_directory, text_extension):
    """Create a text file with the same name as the image file."""
    # Extract the prompt from the filename
    prompt = Path(image_filename).stem

    # Construct the path for the output text file
    text_file_path = Path(output_directory) / (prompt + text_extension)

    try:
        # Write the prompt to the text file
        with open(text_file_path, 'w') as text_file:
            text_file.write(prompt)
        logging.info(f"Text file created: {text_file_path}")
        return 1
    except IOError as e:
        logging.error(f"Failed to write to {text_file_path}: {e}")
        return 0


def main(image_directory, output_directory, image_extension, text_extension):
    # If no output directory is provided, use the image directory
    if not output_directory:
        output_directory = image_directory

    # Ensure the output directory exists, creating it if necessary
    Path(output_directory).mkdir(parents=True, exist_ok=True)

    # Initialize a counter for the number of text files created
    text_files_created = 0

    # Iterate through the files in the directory
    for image_filename in Path(image_directory).iterdir():
        # Check if the file is an image
        if is_image_file(image_filename, image_extension):
            # Create a text file with the same name as the image file and count successes
            text_files_created += create_text_file(image_filename, output_directory, text_extension)

    # Report the result
    if text_files_created == 0:
        logging.info("No images matching the given extensions were found in the specified directory. No caption files were created.")
    else:
        logging.info(f"{text_files_created} text files created successfully.")


def create_gui(image_directory, output_directory, image_extension, text_extension):
    """Create a Gradio interface for the caption creation process."""
    try:
        import gradio
    except ImportError:
        print("The gradio module is not installed. Please install it to use the GUI.")
        exit(1)

    with gradio.Blocks() as demo:
        gradio.Markdown("## Caption From Filename")
        with gradio.Row():
            with gradio.Column():
                image_dir = gradio.Textbox(label="Image Directory", value=image_directory)
                output_dir = gradio.Textbox(label="Output Directory", value=output_directory)
                image_ext = gradio.Textbox(label="Image Extensions", value=" ".join(image_extension))
                text_ext = gradio.Textbox(label="Text Extension", value=text_extension)
                run_button = gradio.Button("Run")
            with gradio.Column():
                output = gradio.Textbox(label="Output", placeholder="Output will be displayed here...", lines=10, max_lines=10)
        run_button.click(main, inputs=[image_dir, output_dir, image_ext, text_ext], outputs=output)
    demo.launch()


if __name__ == "__main__":
    # Set up logging
    logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')

    # Create an argument parser
    parser = argparse.ArgumentParser(description='Generate caption files from image filenames.')
    parser.add_argument('image_directory', help='Directory containing the image files.')
    parser.add_argument('--output_directory', help='Optional: Output directory where text files will be saved. If not provided, the files will be saved in the same directory as the images.')
    parser.add_argument('--image_extension', nargs='+', default=['.jpg', '.jpeg', '.png', '.webp', '.bmp'], help='Extension(s) for the image files. Defaults to .jpg, .jpeg, .png, .webp, .bmp.')
    parser.add_argument('--text_extension', default='.txt', help='Extension for the output text files. Defaults to .txt.')
    parser.add_argument('--gui', action='store_true', help='Launch a Gradio interface for the caption creation process.')

    # Parse the command-line arguments
    args = parser.parse_args()

    if args.gui:
        create_gui(args.image_directory, args.output_directory, args.image_extension, args.text_extension)
    else:
        main(args.image_directory, args.output_directory, args.image_extension, args.text_extension)
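The filename-to-prompt rule used by `create_text_file` is pure path manipulation, so it can be sketched without Gradio or a real directory (the filenames below are made up):

```python
from pathlib import Path

def is_image_file(filename, image_extensions):
    # Case-insensitive extension check, as in the script above
    return Path(filename).suffix.lower() in image_extensions

extensions = {'.jpg', '.jpeg', '.png', '.webp', '.bmp'}
files = ["a_red_fox.PNG", "notes.txt", "city_at_night.jpg"]

# Keep only images; the caption text is simply the filename stem
images = [f for f in files if is_image_file(f, extensions)]
captions = {f: Path(f).stem for f in images}
print(captions)  # {'a_red_fox.PNG': 'a_red_fox', 'city_at_night.jpg': 'city_at_night'}
```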

View File

@ -1,27 +1,53 @@
import argparse
import logging
import os
from pathlib import Path


def writable_dir(target_path):
    """Check that a path is an existing directory that can be written to."""
    path = Path(target_path)
    if path.is_dir():
        if os.access(path, os.W_OK):
            return path
        raise argparse.ArgumentTypeError(f"Directory '{path}' is not writable.")
    raise argparse.ArgumentTypeError(f"Directory '{path}' does not exist.")


def main(folder_path: Path, extension: str, keywords: set = None):
    for file_name in os.listdir(folder_path):
        if file_name.endswith(extension):
            file_path = os.path.join(folder_path, file_name)
            try:
                with open(file_path, "r") as f:
                    text = f.read()

                # Extract tags from the text, splitting on commas
                tags = [tag.strip() for tag in text.split(",")]

                # Remove the specified keywords from the tags list
                if keywords:
                    tags = [tag for tag in tags if tag not in keywords]

                # Remove empty or whitespace-only tags
                tags = [tag for tag in tags if tag.strip() != ""]

                # Join the tags back into a comma-separated string and write back to the file
                with open(file_path, "w") as f:
                    f.write(", ".join(tags))
                logging.info(f"Processed {file_name}")
            except Exception as e:
                logging.error(f"Error processing {file_name}: {e}")


if __name__ == "__main__":
    # Set up logging
    logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')

    parser = argparse.ArgumentParser(description="Remove specified keywords from all text files in a directory.")
    parser.add_argument("folder_path", type=writable_dir, help="path to directory containing text files")
    parser.add_argument("-e", "--extension", type=str, default=".txt", help="file extension of text files to be processed (default: .txt)")
    parser.add_argument("-k", "--keywords", type=str, nargs="*", help="Optional: list of keywords to be removed from text files. If not provided, the default list will be used.")
    args = parser.parse_args()

    default_keywords = {
        "1girl", "solo", "blue eyes", "brown eyes", "blonde hair", "black hair", "realistic",
        "red lips", "lips", "artist name", "makeup", "brown hair", "dark skin",
        "dark-skinned female", "medium breasts", "breasts", "1boy",
    }
    keywords = set(args.keywords) if args.keywords else default_keywords

    main(args.folder_path, args.extension, keywords)
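The tag filtering itself is a pure string transformation, so it can be checked without touching any files. A sketch with a made-up caption line:

```python
def filter_tags(text: str, keywords: set) -> str:
    """Split a comma-separated caption, drop unwanted keywords and empty tags, rejoin."""
    tags = [tag.strip() for tag in text.split(",")]
    tags = [tag for tag in tags if tag not in keywords]
    tags = [tag for tag in tags if tag.strip() != ""]
    return ", ".join(tags)

keywords = {"1girl", "solo", "realistic"}
caption = "1girl, solo, red dress, , realistic, standing"
print(filter_tags(caption, keywords))  # red dress, standing
```

Extracting the loop body into a helper like this also makes the per-file try/except easier to test in isolation.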

View File

@ -1,39 +1,64 @@
import argparse
import os
from pathlib import Path
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
from html2text import html2text


def is_writable_path(target_path):
    """Check that the directory portion of a path exists and is writable."""
    path = Path(os.path.dirname(target_path))
    if path.is_dir():
        if os.access(path, os.W_OK):
            return target_path
        raise argparse.ArgumentTypeError(f"Directory '{path}' is not writable.")
    raise argparse.ArgumentTypeError(f"Directory '{path}' does not exist.")


def main(url, markdown_path):
    # Create a session object
    with requests.Session() as session:
        # Send an HTTP request to the specified URL
        response = session.get(url)
        response.raise_for_status()  # Check for HTTP issues

        # Create a BeautifulSoup object and specify the parser
        soup = BeautifulSoup(response.text, 'html.parser')

        # Ensure the directory for saving images exists
        os.makedirs("./logs", exist_ok=True)

        # Find all image tags and save the images
        for image in soup.find_all('img'):
            image_url = urljoin(url, image['src'])
            try:
                image_response = session.get(image_url, stream=True)
                image_response.raise_for_status()
                image_name = os.path.join("./logs", os.path.basename(image_url))
                with open(image_name, 'wb') as file:
                    file.write(image_response.content)
            except requests.RequestException as e:
                print(f"Failed to download {image_url}: {e}")

        # Convert the HTML content to markdown
        markdown_content = html2text(response.text)

        # Save the markdown content to a file
        try:
            with open(markdown_path, "w", encoding="utf8") as file:
                file.write(markdown_content)
            print(f"Markdown content successfully written to {markdown_path}")
        except Exception as e:
            print(f"Failed to write markdown to {markdown_path}: {e}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Convert HTML to Markdown")
    parser.add_argument("url", help="The URL of the webpage to convert")
    parser.add_argument("markdown_path", help="The path to save the converted markdown file", type=is_writable_path)
    args = parser.parse_args()
    main(args.url, args.markdown_path)
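The image-naming step (resolve each `src` against the page URL, then keep only the final path component) is standard-library-only and can be sketched without network access; the URLs below are placeholders:

```python
import os
from urllib.parse import urljoin

page_url = "https://example.com/blog/entry/2023/05/26/post"
srcs = ["/images/diagram.png", "https://cdn.example.com/photo.jpg"]

local_names = []
for src in srcs:
    image_url = urljoin(page_url, src)               # resolves root-relative srcs
    local_names.append(os.path.basename(image_url))  # final path component only

print(local_names)  # ['diagram.png', 'photo.jpg']
```

One caveat the script shares with this sketch: a query string (e.g. `?v=2`) would survive `os.path.basename` and end up in the saved filename.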

View File

@ -5,25 +5,19 @@ from pathlib import Path
from PIL import Image


def writable_dir(target_path):
    """Check if a path is a valid directory and that it can be written to."""
    path = Path(target_path)
    if path.is_dir():
        if os.access(path, os.W_OK):
            return path
        raise argparse.ArgumentTypeError(f"Directory '{path}' is not writable.")
    raise argparse.ArgumentTypeError(f"Directory '{path}' does not exist.")


def main(directory, in_ext, quality, delete_originals):
    out_ext = "jpg"

    # Create the file pattern string using the input file extension
    file_pattern = f"*.{in_ext}"
@ -54,4 +48,18 @@ def main():
if __name__ == "__main__":
    # Define the command-line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument("directory", type=writable_dir,
                        help="the directory containing the images to be converted")
    parser.add_argument("--in_ext", type=str, default="webp",
                        help="the input file extension")
    parser.add_argument("--quality", type=int, default=95,
                        help="the JPEG quality (0-100)")
    parser.add_argument("--delete_originals", action="store_true",
                        help="whether to delete the original files after conversion")

    # Parse the command-line arguments
    args = parser.parse_args()

    main(directory=args.directory, in_ext=args.in_ext, quality=args.quality, delete_originals=args.delete_originals)
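Both converter scripts build the output path from the input path by swapping the extension; a quick sketch of the `Path.with_suffix` behavior that kind of renaming relies on, using a hypothetical filename:

```python
from pathlib import PurePosixPath

src = PurePosixPath("photos/holiday.snapshot.webp")

# Only the final suffix is swapped; earlier dots in the name survive
jpg_path = src.with_suffix(".jpg")
print(jpg_path)  # photos/holiday.snapshot.jpg

# Converting to the same extension yields the same path, which is why
# checking new_path.exists() before saving matters
print(src.with_suffix(".webp") == src)  # True
```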

View File

@ -1,56 +1,70 @@
import argparse
import os
from pathlib import Path

from PIL import Image


def writable_dir(target_path):
    """Check if a path is a valid directory and that it can be written to."""
    path = Path(target_path)
    if path.is_dir():
        if os.access(path, os.W_OK):
            return path
        raise argparse.ArgumentTypeError(f"Directory '{path}' is not writable.")
    raise argparse.ArgumentTypeError(f"Directory '{path}' does not exist.")


def main():
    # Define the command-line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument("directory", type=writable_dir,
                        help="the directory containing the images to be converted")
    parser.add_argument("--in_ext", type=str, default="webp",
                        help="the input file extension")
    parser.add_argument("--out_ext", type=str, default="webp",
                        help="the output file extension")
    parser.add_argument("--delete_originals", action="store_true",
                        help="whether to delete the original files after conversion")

    # Parse the command-line arguments
    args = parser.parse_args()
    directory = Path(args.directory)
    in_ext = args.in_ext
    delete_originals = args.delete_originals

    # Create the file pattern string using the input file extension
    file_pattern = f"*.{in_ext}"

    # Get the list of files in the directory that match the file pattern
    files = list(directory.glob(file_pattern))

    # Iterate over the list of files
    for file in files:
        try:
            # Open the image file
            img = Image.open(file)

            # Create a new file path with the output file extension
            new_path = file.with_suffix(f".{args.out_ext}")
            print(new_path)

            # Skip the conversion if the output file already exists
            if new_path.exists():
                print(f"Skipping {file} because {new_path} already exists")
                continue

            # Save the image to the new file as lossless
            img.save(new_path, lossless=True)

            # Optionally, delete the original file
            if delete_originals:
                file.unlink()
        except Exception as e:
            print(f"Error processing {file}: {e}")


if __name__ == "__main__":
    main()

View File

@ -10,11 +10,25 @@ import argparse
import shutil
def aspect_ratio(img_path):
    """
    Calculate and return the aspect ratio of an image.

    Parameters:
        img_path: A string representing the path to the input image.

    Returns:
        float: Aspect ratio of the input image, defined as width / height.
        Returns None if the image cannot be read.
    """
    try:
        image = cv2.imread(img_path)
        if image is None:
            raise ValueError("Image not found or could not be read.")
        height, width = image.shape[:2]
        return float(width) / float(height)
    except Exception as e:
        print(f"Error: {e}")
        return None
def sort_images_by_aspect_ratio(path):
    """Sort all images in a folder by aspect ratio"""
@ -29,7 +43,26 @@ def sort_images_by_aspect_ratio(path):
    return sorted_images
def create_groups(sorted_images, n_groups):
    """
    Create groups of images from a sorted list of images.

    This function takes a sorted list of images and a number of groups as input, and returns a list of
    groups of roughly equal size.

    Parameters:
        sorted_images (list of tuples): A list of tuples, where each tuple contains the path to an image and its aspect ratio.
        n_groups (int): The number of groups to create.

    Returns:
        list of lists: A list of groups, where each group is a list of tuples representing the images in the group.

    Raises:
        ValueError: If n_groups is not a positive integer, or if it is greater than the number of images.
    """
    if not isinstance(n_groups, int) or n_groups <= 0:
        raise ValueError("Error: n_groups must be a positive integer.")
    if n_groups > len(sorted_images):
        raise ValueError("Error: n_groups must be less than or equal to the number of images.")

    n = len(sorted_images)
    size = n // n_groups
    groups = [sorted_images[i * size : (i + 1) * size] for i in range(n_groups - 1)]
@ -37,11 +70,30 @@ def create_groups(sorted_images, n_groups):
    return groups
def average_aspect_ratio(group):
"""Calculate average aspect ratio for a group"""
aspect_ratios = [aspect_ratio for _, aspect_ratio in group]
avg_aspect_ratio = sum(aspect_ratios) / len(aspect_ratios)
print(f"Average aspect ratio for group: {avg_aspect_ratio}")
return avg_aspect_ratio
"""
Calculate the average aspect ratio for a given group of images.
Parameters:
group (list of tuples):, A list of tuples, where each tuple contains the path to an image and its aspect ratio.
Returns:
float: The average aspect ratio of the images in the group.
"""
if not group:
print("Error: The group is empty")
return None
try:
aspect_ratios = [aspect_ratio for _, aspect_ratio in group]
avg_aspect_ratio = sum(aspect_ratios) / len(aspect_ratios)
print(f"Average aspect ratio for group: {avg_aspect_ratio}")
return avg_aspect_ratio
except TypeError:
print("Error: Check the structure of the input group elements. They should be tuples of (image_path, aspect_ratio).")
return None
except Exception as e:
print(f"Error: {e}")
return None
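For instance, averaging a small hypothetical group by hand (only the ratios matter here; the paths are placeholders):

```python
# Hypothetical (path, aspect_ratio) tuples like those produced by the sorter.
group = [("wide.png", 1.78), ("square.png", 1.0), ("tall.png", 0.5)]
aspect_ratios = [ratio for _, ratio in group]
avg = sum(aspect_ratios) / len(aspect_ratios)
print(round(avg, 4))  # (1.78 + 1.0 + 0.5) / 3 ≈ 1.0933
```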
def center_crop_image(image, target_aspect_ratio):
"""Crop the input image to the target aspect ratio.
Args:
image: A numpy array representing the input image.
target_aspect_ratio: The desired width-to-height ratio (a float).
Returns:
A numpy array representing the cropped image.
Raises:
ValueError: If the input image is not a valid numpy array with at least two dimensions or if the calculated new width or height is zero.
"""
# Check if the input image is a valid numpy array with at least two dimensions
if not isinstance(image, np.ndarray) or image.ndim < 2:
raise ValueError("Input image must be a valid numpy array with at least two dimensions.")
height, width = image.shape[:2]
current_aspect_ratio = float(width) / float(height)
# If the current aspect ratio is already equal to the target aspect ratio, return the image as is
if current_aspect_ratio == target_aspect_ratio:
return image
# Calculate the new width and height based on the target aspect ratio
if current_aspect_ratio > target_aspect_ratio:
new_width = int(target_aspect_ratio * height)
if new_width == 0:
raise ValueError("Calculated new width is zero. Please check the input image and target aspect ratio.")
x_start = (width - new_width) // 2
cropped_image = image[:, x_start:x_start+new_width]
else:
new_height = int(width / target_aspect_ratio)
if new_height == 0:
raise ValueError("Calculated new height is zero. Please check the input image and target aspect ratio.")
y_start = (height - new_height) // 2
cropped_image = image[y_start:y_start+new_height, :]
return cropped_image
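The width-branch arithmetic can be sanity-checked on a synthetic array (a standalone NumPy sketch that repeats the math rather than calling the function):

```python
import numpy as np

image = np.zeros((100, 300, 3), dtype=np.uint8)  # height=100, width=300, ratio 3.0
target = 1.0  # square crop

height, width = image.shape[:2]
new_width = int(target * height)      # 100
x_start = (width - new_width) // 2    # center the crop: 100
cropped = image[:, x_start:x_start + new_width]
print(cropped.shape)  # (100, 100, 3)
```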
def copy_related_files(img_path, save_path):
"""
Copy all files in the same directory as the input image that have the same base name as the input image to the
output directory with the corresponding new filename.
Args:
img_path (str): Path to the input image file.
save_path (str): Path to the output directory where the files should be copied with a new name.
"""
# Get the base filename and directory
img_dir, img_basename = os.path.split(img_path)
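The base-name matching that this function performs can be sketched with the standard library alone (hypothetical filenames in a temporary directory; this illustrates the lookup, not the exact copy logic):

```python
import os
import tempfile

# Build a tiny directory with an image and two sibling files sharing its base name.
src_dir = tempfile.mkdtemp()
for name in ("photo.png", "photo.txt", "photo.npz", "other.txt"):
    open(os.path.join(src_dir, name), "w").close()

# Everything whose base name (extension stripped) matches the image's is "related".
base = os.path.splitext("photo.png")[0]
related = [f for f in os.listdir(src_dir)
           if os.path.splitext(f)[0] == base]
print(sorted(related))  # ['photo.npz', 'photo.png', 'photo.txt']
```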