- Add GLM-Image (zai-org/GLM-Image) model detection and loading
- Add custom pipeline loader with proper component handling:
  - ByT5 text encoder (cannot reuse the shared T5 encoder because its hidden size differs)
  - Vision-language encoder (9B AR model)
  - DiT transformer (7B)
- Fix EOS token early stopping in AR generation
- Add AR token generation progress tracking with terminal progress bar
- Fix uninitialized audio variable in processing
- Add TAESD support for GLM-Image (using f1 variant)
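
For reviewers, the EOS early-stopping and progress-tracking behavior can be illustrated with a minimal, self-contained sketch. This is not the code in this change: `generate_tokens`, `terminal_bar`, the `next_token` callable, and the token ids are all hypothetical names chosen for illustration, and the real AR model is replaced by a scripted toy sampler.

```python
import sys
from typing import Callable, List, Optional


def generate_tokens(
    next_token: Callable[[List[int]], int],  # hypothetical stand-in for the AR model step
    prompt: List[int],
    max_new_tokens: int,
    eos_token_id: int,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> List[int]:
    """Sample tokens one at a time, stopping at EOS or at max_new_tokens."""
    context = list(prompt)
    generated: List[int] = []
    for step in range(1, max_new_tokens + 1):
        token = next_token(context)
        if on_progress is not None:
            on_progress(step, max_new_tokens)  # report every sampling step
        if token == eos_token_id:
            break  # early stop: do not keep sampling past EOS
        context.append(token)
        generated.append(token)
    return generated


def terminal_bar(step: int, total: int, width: int = 30) -> None:
    """Redraw a single-line progress bar on stderr."""
    filled = width * step // total
    bar = "#" * filled + "." * (width - filled)
    sys.stderr.write(f"\r[{bar}] {step}/{total}")
    sys.stderr.flush()


# Toy usage: a scripted "model" that emits EOS (id 2, assumed) as its fourth token.
script = iter([5, 7, 9, 2, 11])
out = generate_tokens(
    lambda ctx: next(script),
    prompt=[1],
    max_new_tokens=8,
    eos_token_id=2,
    on_progress=terminal_bar,
)
sys.stderr.write("\n")
print(out)  # [5, 7, 9]
```

The loop exits as soon as EOS is sampled, so the token after it (`11`) is never requested, and the EOS token itself is not included in the returned list.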