
Fixing DAR + a few additions #28

Open
UpscaleAnon opened this issue Aug 27, 2024 · 4 comments
UpscaleAnon commented Aug 27, 2024

There is a really big drawback to this setup: it does not respect DAR (Display Aspect Ratio). Videos that are stored as, for example, 720x480 but are supposed to display as 4:3 get treated as 720x480 when they should display as 640x480.
This could simply be solved by letting you set the width, instead of only the height, before upscaling and possibly after upscaling.
Possibly the same issue as #9
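To illustrate the problem, here's a minimal sketch (not VideoJaNai code) of how the intended display size can be computed from the container's DAR flag:

```python
from fractions import Fraction

def display_dimensions(width, height, dar):
    """Compute the size a video should be displayed at, from its
    storage dimensions and the container's display aspect ratio.
    The width is resampled, as players typically do."""
    display_width = round(height * Fraction(dar))
    # Keep the width even, which most encoders require.
    display_width += display_width % 2
    return display_width, height

# A 720x480 source flagged as 4:3 should display as 640x480:
print(display_dimensions(720, 480, "4/3"))  # (640, 480)
```

In ffmpeg terms this corresponds to a filter chain like `scale=640:480,setsar=1` applied before the frames reach the upscaler.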
Would be nice to have more options in general, like an option to discard subtitles, pick which audio stream you want (or keep all audio streams), and remove the embedded TTF files so that you can use the mp4 container instead of mkv.
As well as support for SVT-AV1 2.1.0, though I guess I can just edit the ffmpeg output settings manually. Haven't really tried that yet.
Maybe even support for models like AnimeVideoV3 that use .bin and .param files? I prefer ESRGAN's AnimeVideoV3 over AnimeJaNai as it cleans up the image a lot more, but there is no 2x model for the pth version.
However, the 4x AnimeVideoV3 pth model claims it can be run at 2x by passing the parameter -s 2, but VideoJaNai doesn't let me do that. So maybe implement a way to tell the model what scale it should use?
Maybe also an option to alter the framerate.
Great software though; being able to upscale directly into a video, rather than extracting frames, upscaling, and then encoding back into a video, saves a LOT of time, especially with TensorRT on top of it.
A few changes and it would be absolutely perfect.

UpscaleAnon (Author) commented

Did some digging through the code, specifically animejanai_core.py.
Is this program really set up so that it only does a 2x upscale of everything, even if you use a 4x model?
In terms of speed it doesn't feel like it, but I see functions with names like "upscale2x".
Though I don't even know if ESRGAN would run faster if you could somehow pass the -s 2 argument to it.
And then in Resources.Designer.cs there's this:

> Looks up a localized string similar to Generate a dynamic engine which supports resolutions up to 1920x1080. Works with Compact, SPAN, 4x ESRGAN, etc.

So does the program only upscale to 1080p, despite it claiming to work with 4x ESRGAN? Or am I just misunderstanding that part?

the-database (Owner) commented

Thanks for the feedback.

ONNX upscale models typically support only a single scale, and VideoJaNai runs each model at its native scale. The "upscale2x" function name is just an old name; the original version of this code was written to work with the 2x models I trained, before I updated the code to work with other models.

I'm not familiar with the -s 2 parameter for AnimeVideoV3. If you share some details on the exact model and program you are referring to, I can see exactly what they do in this scenario and whether VideoJaNai already supports something similar or whether it could be added.

The note about 1080p refers to TensorRT engine generation. TensorRT engines may be static (built for a single input resolution) or dynamic (covering any input resolution between a minimum and a maximum). The note says that when dynamic engines are generated, they have a maximum input resolution of 1080p. If you upscale a video larger than 1080p, a static engine is generated and used instead.
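The distinction above can be sketched as a small selection rule. This is only an illustration of the described behavior, not the project's actual code; the 1920x1080 ceiling is taken from the resource string quoted earlier:

```python
DYNAMIC_MAX_W, DYNAMIC_MAX_H = 1920, 1080  # max input for dynamic engines

def engine_kind(width, height):
    """Pick the TensorRT engine type for a given input resolution,
    per the behavior described above."""
    if width <= DYNAMIC_MAX_W and height <= DYNAMIC_MAX_H:
        # One dynamic engine covers every resolution up to 1080p.
        return "dynamic"
    # Larger inputs get a static engine built for that exact size.
    return "static"

print(engine_kind(1280, 720))   # dynamic
print(engine_kind(3840, 2160))  # static
```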

UpscaleAnon (Author) commented

The -s 2 parameter for AnimeVideoV3 is used through its original bin/param models via NCNN: https://github.com/xinntao/Real-ESRGAN
There's a 0.2.5.0 Windows build with an EXE called realesrgan-ncnn-vulkan.exe, and you just pass the -s 2 argument to that, assuming the model supports that scale.
The bin/param files can't be converted to ONNX for VideoJaNai as far as I know, but Xinntao has a PTH version of the model: https://openmodeldb.info/models/4x-realesr-animevideo-v3
And according to the GitHub docs, it claims it can be used at 1x, 2x, 3x, and 4x: https://github.com/xinntao/Real-ESRGAN/blob/master/docs/anime_video_model.md
But I have no idea how that works.
Sorry if I'm not much help.
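For what it's worth, one common way tools expose arbitrary output scales from a fixed-scale model is to run the model at its native factor and then resample the result; Real-ESRGAN's Python inference script does this via its --outscale option, and it's plausible (though unconfirmed here) that the ncnn -s flag works the same way. A sketch of the arithmetic:

```python
def output_sizes(in_w, in_h, native_scale=4, requested_scale=2):
    """Return the model's output size at its native scale, and the
    final size after resampling down to the requested scale."""
    model_size = (in_w * native_scale, in_h * native_scale)
    final_size = (in_w * requested_scale, in_h * requested_scale)
    return model_size, final_size

# A 640x480 input through a 4x model, requested at 2x:
print(output_sizes(640, 480))  # ((2560, 1920), (1280, 960))
```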

UpscaleAnon (Author) commented

> As well as support for SVT-AV1 2.1.0, though I guess I can just edit the ffmpeg output settings manually. Haven't really tried that yet.

Just tried this but it doesn't work unfortunately.

[vost#0:0 @ 00000246fde5ec80] Unknown encoder 'libsvtav1'
[vost#0:0 @ 00000246fde5ec80] Error selecting an encoder
Error opening output files: Encoder not found

It does, however, seem to support libaom-av1, but that's quite useless due to its low speed, and it performs best with two-pass encoding anyway, which I don't think would be possible in VideoJaNai.
Though I found an easy workaround: replace ffmpeg.exe in C:\Users\Anon\AppData\Roaming\VideoJaNai\ffmpeg
with a more up-to-date build that includes SVT-AV1, and it seems to work just fine.
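Before swapping the bundled binary, it's easy to check whether a given ffmpeg build includes the encoder at all. A small sketch; the parsing assumes ffmpeg's `-encoders` output lists one encoder name per line, which is its documented format:

```python
import subprocess

def list_encoders(ffmpeg_path="ffmpeg"):
    """Return the text output of `ffmpeg -encoders` for a binary."""
    result = subprocess.run([ffmpeg_path, "-hide_banner", "-encoders"],
                            capture_output=True, text=True)
    return result.stdout

def has_encoder(encoders_text, name="libsvtav1"):
    """True if the encoder name appears in the listing."""
    return any(name in line.split() for line in encoders_text.splitlines())

# e.g. check the bundled build mentioned above:
# has_encoder(list_encoders(r"C:\Users\Anon\AppData\Roaming\VideoJaNai\ffmpeg\ffmpeg.exe"))
```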
