Turns out ddim_eta is now a global general-settings param and is no longer passed directly as p.ddim_eta, so I made it a schedule.
Todo: only show it in the UI if DDIM is selected.
added consistency flow masks
- there is now an option to use flow consistency masks and an attached option for consistency mask blur, defaulted to 2.
- if you save extra frames, it also saves the consistency masks now
- you can see the effect on the flow in the flow outputs as well
- it doesn't work as well with cadence because you see afterimages, but raising the blur can make it a little better.
fixed Frames to Video
- made the ffmpeg routine that Frames to Video uses able to take image files other than png. If the input is png, it includes -vcodec png as before. For anything else it includes -vcodec libx264, which works for jpgs. (jpgs don't work with -vcodec png, since that codec is specific to png, so I made it switchable.) I haven't tested other filetypes, but I expect they work too.
- also added two more lines of instruction for how to use the file string.
- I also changed a few ransac functions for future use. They work as normal, but now have a switching behavior if passed depth. But, I'm not passing depth to them for now.
- a few minor variable name edits in hybrid video to align the code better (mostly renamed matrices to M, as is often the convention)
- commented a bunch of unused imports in render.py
- I'll leave it up to someone else to delete them once it's verified that everything works fine with them commented out. I searched for usages and didn't find any in that file. VSCode showed them as gray automatically, but I also verified by hand.
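The codec switch described above for Frames to Video can be sketched roughly like this. It is a minimal illustration, not the actual routine: the helper name `vcodec_args` is hypothetical, and where the arguments go in the full ffmpeg command is left to the existing code.

```python
from pathlib import Path


def vcodec_args(image_pattern):
    """Choose the -vcodec arguments from the extension of the frame pattern.

    Hypothetical helper: png keeps the png codec, anything else gets libx264.
    """
    ext = Path(image_pattern).suffix.lower()
    if ext == ".png":
        return ["-vcodec", "png"]
    # Only jpg has been tested on this path, per the notes above.
    return ["-vcodec", "libx264"]


png_args = vcodec_args("frames/%05d.png")
jpg_args = vcodec_args("frames/%05d.jpg")
```

Keying the choice off the file extension keeps the png path byte-for-byte the same as before, so existing png workflows should not change.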
Discovered that RAFT wasn't actually working due to an issue with the function that got the flows. There was a missing "elif". So, the RAFT flow would get calculated and stored in the variable 'r', but then 'r' would always be overwritten by the default Farneback at the end. We were fooling ourselves into thinking that was RAFT, when in actuality the RAFT flow is invalid and causes an error if actually used.
- Changed function call for flow methods so that this can never happen. Now, each case returns directly.
- Added RAFT to the deprecation utils for now, so it converts to Farneback. We can remove the RAFT to Farneback conversion when we get RAFT working.
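To make the bug concrete, here is a small sketch of the pattern, not the real code. The `calc_*` functions are stand-ins that return labels instead of flow fields, and "DIS" is only a placeholder name for a second method.

```python
def calc_raft(prev, cur):
    return "raft-flow"        # stand-in for a real flow field


def calc_dis(prev, cur):
    return "dis-flow"         # stand-in, placeholder method


def calc_farneback(prev, cur):
    return "farneback-flow"   # stand-in for the default method


def get_flow_buggy(prev, cur, method):
    if method == "RAFT":
        r = calc_raft(prev, cur)
    if method == "DIS":       # should have been `elif`
        r = calc_dis(prev, cur)
    else:                     # also runs for RAFT and overwrites r
        r = calc_farneback(prev, cur)
    return r


def get_flow(prev, cur, method):
    # Each case returns directly, so nothing can be overwritten later.
    if method == "RAFT":
        return calc_raft(prev, cur)
    if method == "DIS":
        return calc_dis(prev, cur)
    if method == "Farneback":
        return calc_farneback(prev, cur)
    raise ValueError(f"unknown flow method: {method}")
```

With the buggy version, asking for "RAFT" silently gives back the Farneback result. The fixed version returns the RAFT result and raises on an unknown method name instead of falling through to a default.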
Realigned the way I handle frame indexes in the motion routines so everything lines up more clearly
Major improvement to motion using prev_img during cadence!
- added a prev_img during cadence so that there is a prev_img to refer to for the flow
Fixed color matching issue with first frame on Image and Video Init modes
- first frame color match can't be done beforehand, so it's done afterwards. But, that normally makes for a very bad first frame. So, I added a redo for it to clean up the color matched image on first frame.
Major improvement to RANSAC
- switched to use SIFT for feature matching instead of Lucas-Kanade
- changed all border_mode to REFLECT_101, which matches how optical flow handles it, and removed all the excess silly border_mode translations. This works much better.
I figured out a hack to make optical flow cadence work in 2D. To do optical flow cadence, I have to warp the flow field. But the 2D animation warping function, usually used on images, would mess with the values of the flow (as if they were colors). So I scale them down by 1000 going in and scale them back up going out, which eliminates the effect that made the image wobble around.
The same scaling actually messes with 3D optical flow cadence, so I leave that working at the normal scale factor.
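The 2D scaling hack can be sketched as a thin wrapper around whatever warp function is in use. This is only an illustration under assumptions: `warp_flow_2d` and `shift_right` are hypothetical names, and `shift_right` is a trivial stand-in for the real 2D animation warp.

```python
import numpy as np

FLOW_SCALE = 1000.0  # the factor mentioned in the notes above


def warp_flow_2d(flow, warp_fn, scale=FLOW_SCALE):
    """Scale the flow down, warp it with an image warp, then scale it back up."""
    small = flow / scale      # shrink the values going in
    warped = warp_fn(small)   # reuse the image warp on the flow field
    return warped * scale     # restore the original magnitude going out


def shift_right(a):
    # Stand-in warp: move everything one pixel to the right.
    return np.roll(a, 1, axis=1)


# A tiny 2x3 flow field with (dx, dy) per pixel, including negative values.
flow = np.arange(12, dtype=np.float32).reshape(2, 3, 2) - 6.0
warped = warp_flow_2d(flow, shift_right)
```

For a warp that only moves values around, the result matches warping the unscaled flow up to floating-point rounding, so the wrapper itself adds nothing beyond the rescale.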
I also made one modification to 3D optical flow cadence where it temporarily changes the sampling method (used by the 3D warping function) to 'nearest' just for the flow warping, then restores it to its previous value. This should help to minimize any pixel effects from warping.
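One way to express that set-then-restore pattern is a small context manager. This is a sketch, not the actual change: the attribute name `sampling_mode`, the `anim_args` object, and the starting value "bicubic" are assumptions for illustration.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_sampling_mode(args, mode="nearest"):
    """Switch args.sampling_mode for the duration of the block, then restore it."""
    previous = args.sampling_mode
    args.sampling_mode = mode
    try:
        yield args
    finally:
        # Restore even if the warp raises.
        args.sampling_mode = previous


# Hypothetical settings object standing in for the animation args.
anim_args = SimpleNamespace(sampling_mode="bicubic")

with temporary_sampling_mode(anim_args):
    mode_during_warp = anim_args.sampling_mode  # flow warp would run here

mode_after_warp = anim_args.sampling_mode
```

The `try`/`finally` means the previous value comes back even when the flow warp fails partway through.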
After thorough testing of generation directly, I verified that it can handle a tile size of 8. If this tile size is not met, animation does not work correctly, because the images coming out of generation don't match the specified dimensions, making the prev_img wrong, which makes animation warping wrong.
The previous tile size of 64 was legacy, from the notebook and old auto1111, I believe. The pipeline can handle dimensions divisible by 8, verified. However, I'm still not sure whether the images produced are as good as when you use the 64 tile size, which is the size of the latent representation.
In any case, there was no limit on this before, and these changes ensure that animation is always accurate. I suggest leaving the slider at increments of 64. But now, if someone enters a dimension manually, it is forced to a multiple of 8 to ensure proper sizing through the engine.
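Forcing a manually entered dimension to the tile size could look roughly like this. It is a sketch with hypothetical helper names, and it assumes rounding down to the nearest multiple; the actual change may round differently.

```python
TILE = 8  # tile size the pipeline was verified to handle


def snap_to_tile(value, tile=TILE):
    """Round a dimension down to a multiple of `tile`, never below one tile."""
    return max(tile, (value // tile) * tile)


def snap_dimensions(width, height, tile=TILE):
    return snap_to_tile(width, tile), snap_to_tile(height, tile)


snapped = snap_dimensions(770, 513)
```

Values that are already multiples of 8, including every 64-step slider position, pass through unchanged.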