Animation Audio: Development Test (Linux AppImage)

Hey again Krita animators!

For a number of months, @eoinoneill and I have been working on and off on a major rewrite of Krita’s animation systems, with the goal of improving the way Krita handles animation-audio synchronization. The goal is broad: we want to improve not only the quality and consistency of animation audio in Krita, but also the workflow for the common tasks of animating to dialogue, music, and other kinds of sound.

If you’re interested in the details or want to follow the development of this feature more closely, you can of course check out the GitLab merge request here.

Well… While we aren’t quite there yet and are now targeting v5.2 (instead of the upcoming v5.1), we’ve decided to stop by and share an early preview with our Krita animator community here on KA, in the form of a Linux test AppImage which you can download and try out here:

Krita - Animation Audio Test AppImage #1 (Linux)

(Obviously this comes with the usual warnings and caveats about running unofficial development builds from unusual sources. Expect issues, bugs and crashes; don’t do any important work using unstable software; be careful that you’re running things from trusted sources; etc.)

(Also, sorry Windows and Mac users, this test build is only available as a Linux AppImage.)

Here’s what you should expect…

What should work solidly at this point & what is worth testing

  • Animators should be able to load up a compatible audio file and animate with it as expected.
  • Animation playback should work as animators expect, with a consistent synchronization between image frames and sound across a variety of playback conditions.
  • Scrubbing and clicking around on the timeline should produce small chunks of audio that are easy to discern. Right now we’re pushing a 0.25 sec window, which is larger than a single animation frame, because we think the extra context makes it a bit easier to work with. As such, you’ll hear some overlap between frames, but this is by design for now and open to debate (there’s some quick arithmetic on the window size right after this list). (Maybe something that should be configurable… Not sure, so we’re looking forward to feedback on this point.)
  • Rendering an animation to a file should work as expected in all cases. When rendering an animation with attached audio, the synchronization of audio and video should be perceptibly consistent with the playback within Krita itself. In other words, audio and video should synchronize as you’d expect in both the program and the rendered video file.
  • Obviously everything should be relatively stable and functional. But this is still a development build, so it’s possible that bugs will be present. If you come across bugs, this merge request is the best place to report them for the time being.
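To make the overlap point a bit more concrete, here’s some quick throwaway arithmetic (just an illustration, not Krita code) showing how many frames that 0.25 sec window spans at a few common frame rates:

```cpp
// Illustration only -- the 0.25 s value is the one mentioned above; the frame
// rates are just common examples. This is not Krita's implementation.
#include <iostream>

int main()
{
    const double scrubWindowSec = 0.25; // audio chunk played per scrub/click
    const int rates[] = {12, 24, 30, 60};
    for (int fps : rates) {
        const double frameSec = 1.0 / fps;
        std::cout << fps << " fps: one frame = " << frameSec * 1000.0 << " ms, "
                  << "so the window covers about " << scrubWindowSec / frameSec
                  << " frames\n";
    }
    return 0;
}
```

At 24 fps that works out to roughly six frames of audio per click, which is exactly where the overlap you’ll hear comes from.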

Known issues & things that are not yet where they need to be

  • Playback speed adjustment is non-functional for now and has been disabled. This is still on our to-do list, and I think we have a good idea of how to go about it under MLT, but we’re just not there yet.
  • The “drop frames” button has also been disabled for now, and we’re not sure what exactly to do about that yet. In the case of audio-synchronization, the images kind of have to go along with the precise timing of the audio track. This is still an open question, but what’s most important for now is that audio/video synchronization is stable.
  • We’d like to implement an audio waveform visualization track on the timeline docker to ship alongside the audio update, as it’s a great way to help place frames in time to music and dialogue, but we’re not there yet (a rough sketch of the idea follows this list).
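For anyone curious what the waveform track involves, the usual approach (this is only a generic sketch with made-up names, not the code we plan to ship) is to reduce the decoded samples down to one min/max pair per timeline column:

```cpp
// Generic sketch of waveform peak extraction -- function and parameter names
// are invented for illustration; this is not Krita's timeline code.
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// samplesPerColumn would typically be (sampleRate * visibleSeconds) / columnCount.
std::vector<std::pair<float, float>> buildPeaks(const std::vector<float> &samples,
                                                std::size_t samplesPerColumn)
{
    std::vector<std::pair<float, float>> peaks;
    if (samplesPerColumn == 0 || samples.empty()) {
        return peaks;
    }
    for (std::size_t i = 0; i < samples.size(); i += samplesPerColumn) {
        const std::size_t end = std::min(i + samplesPerColumn, samples.size());
        const auto [lo, hi] = std::minmax_element(samples.begin() + i,
                                                  samples.begin() + end);
        peaks.emplace_back(*lo, *hi); // one drawn column: min at bottom, max at top
    }
    return peaks;
}
```

The drawing part is then just one short vertical line per pair, scaled to the height of the track.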

Where do you come in?

If you’re interested in seeing what’s new with animation audio in Krita, just give it a try and let us know what you think!

We’re really looking for all kinds of feedback and opinions about how you feel the new audio system is working, whether it seems to be an improvement over the old one so far, how well you feel that it synchronizes video and audio both inside Krita and in rendered video files, how easy it is to work with, etc.

Just go ahead and drop any thoughts you have related to audio synchronization and workflow here in this thread. The feedback that we get here will be a great way to inform our next steps when it comes to animation audio in Krita, so don’t be shy and we can hopefully get a conversation going that will lead to making Krita 5.2 a great release for animators. :slight_smile:

P.S.: Eoin and I are going to be away for a little while for a summer break soon, so don’t worry if we go radio silent for a bit. We’ll be back! :stuck_out_tongue:

12 Likes

Hi @emmetpdx and @eoinoneill and thank you for the effort of all this :slight_smile:

I’ve quickly tried the test appimage in a very small way with a short test animation I made some time ago and an associated test audio file.
I seem to be one of the lucky people who have no problem with audio synchronisation even though I use a creaky eleven-year-old desktop computer.

The only thing I’ve noticed so far, with limited and simple testing, is that with slow scrubbing the sound happens slightly earlier than it does with 5.1.0-beta2.
That may be related to the 0.25 sec window you mentioned.
However, that doesn’t seem to affect my perception of synchronisation at full speed 24fps playback.
I used a simple test animation and a simple audio track so something like lip-voice sync may give a different impression.

If anyone wants to try my test files, here they are (1.7MB download):
Ahab-Tick-Tock-Test
I should try to make something more complicated and sophisticated one day.

They’re simple and should be self explanatory. You can check the exact timing by using Audacity or a similar application.

Enjoy your summer break and have good weather :slight_smile:

4 Likes

Definitely looking forward to having more reliable and synchronized audio when working with lipsyncing, sounds like a great step in the right direction with the animation tools. Great work!

Unfortunately I’ll have to hold off on direct feedback or testing as I’m not currently running Linux on any of my animation work computers. But I’ll leave some generic feedback:

Scrubbing and clicking producing snippets of audio is great and will help with lip-syncing a lot. Not sure if it needs to be adjustable in playback length if you can find the sweet spot. Flash had a good snippet length back in those days, but I’d have to go investigate to see how long those snippets actually were.

1 Like

I am a Windows user, so I’ve only read the text.

I would say the extra bit of audio might not be worth it, but maybe it makes sense in use; I’m heavily in doubt for now.

As for the waveform rendering, Qt has a tutorial on how to render waveforms on their website. I would check that out if you haven’t yet.

Other than that hope the vacations go well.

Thanks for the work, but I think making the test build Linux-only will reduce the number of potential testers by a lot.

4 Likes

@emmetpdx can we get a test build for Windows too?

1 Like

I don’t think the new dependencies build on Windows yet – they certainly aren’t in the binary factory, so a Windows build is unlikely to materialize any time soon. Plus, @emmetpdx and @eoinoneill are taking two weeks of vacation starting right now :-). (And given how much work this refactoring was, well deserved!)

2 Likes

Gave the app image a spin, looks promising. :slightly_smiling_face:
Synchronization looks to be good.

I think 0.25 is a bit too much. It’s hard to tell which frame a beat starts at. (But like Ahab, I didn’t test with a lip-sync animation.)

Playback performance on high resolution files (4K UHD plus margins) is a bit worse on my machine.

Ahab’s test file plays without problems.

3 Likes

Hey all, thanks for the input so far. @eoinoneill and I are officially on vacation right now (and honestly it’s exactly what we both needed… lol), but I’m gonna swing by and respond to a few things. :slight_smile:

@AhabGreybeard

I’ve quickly tried the test appimage in a very small way with a short test animation I made some time ago and an associated test audio file.

Thanks for testing and sharing the test file, Ahab.

The only thing I’ve noticed so far, with limited and simple testing, is that with slow scrubbing the sound happens slightly earlier than it does with 5.1.0-beta2.

We basically push a chunk of sound at the same time as we ask the image to flip to a different frame, but there is threading involved at that point, so the two may happen at slightly different times depending on the speed of your computer.

There may be ways to optimize the flipping of frames on the image side of things to make it faster in the future, but that’ll be something to look at later. It should be practically imperceptible though, so I’ll definitely do some extra testing to make sure that there’s nothing funky going on.
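To illustrate roughly what I mean, here’s a made-up example (none of this is the code in the branch) of how the audio push and the frame switch can land a few milliseconds apart:

```cpp
// Invented illustration (not Krita's actual code) of the ordering described
// above: the audio chunk is pushed right away, while the frame switch finishes
// asynchronously after a machine-dependent amount of time.
#include <chrono>
#include <future>
#include <iostream>
#include <thread>

int main()
{
    using namespace std::chrono;
    const auto scrubEvent = steady_clock::now();

    // "Push a chunk of sound": assume the audio backend starts playing almost at once.
    std::cout << "audio chunk starts ~0 ms after the scrub event\n";

    // "Ask the image to flip": frame regeneration runs on a worker thread, so
    // the visible frame can lag the audio slightly on slower machines.
    auto frameReady = std::async(std::launch::async, [] {
        std::this_thread::sleep_for(milliseconds(8)); // stand-in for frame regeneration
    });
    frameReady.wait();

    const auto lagMs =
        duration_cast<milliseconds>(steady_clock::now() - scrubEvent).count();
    std::cout << "frame visible ~" << lagMs << " ms after the scrub event\n";
    return 0;
}
```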

@Ralek

Definitely looking forward to having more reliable and synchronized audio when working with lipsyncing, sounds like a great step in the right direction with the animation tools. Great work!

Thanks Ralek! Yeah, I know how important sound can be for animators. There’s so much lipsyncing involved in just about any project. Also, I’m excited to try to animate/storyboard some scenes to music sometime. :smiley:

Scrubbing and clicking producing snippets of audio is great and will help with lip-syncing a lot. Not sure if it needs to be adjustable in playback length if you can find the sweet spot. Flash had a good snippet length back in those days, but I’d have to go investigate to see how long those snippets actually were.

We’ll probably embrace our inner KDE developers and make it configurable. :laughing:

@EyeOdin

I would say the extra bit of audio might not be worth it, but maybe it makes sense in use; I’m heavily in doubt for now.

I understand this sentiment, because we all want to click on a frame and hear a preview of the contents of that frame only, right?

The issue is that in a 24fps animation, a single frame lasts ~42 milliseconds or ~0.04 seconds, and at 60fps the frame time goes down to ~16ms. As the frame time gets smaller it becomes increasingly difficult to discern what you’re hearing (you end up hearing only one small part of the envelope of any given sound).

For example, when I check the envelope of a relatively tight snare drum sound in my digital audio workstation, it’s about 250ms long. So if you’re trying to line up your animation with a drum loop (which is one of the tests we’ve been doing), at higher frame rates it becomes nearly impossible to tell where the actual drum hits are happening. I haven’t tried it yet, but I’d imagine a similar issue would come up for people lip-syncing dialogue.
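To put numbers on that (again just arithmetic, not Krita code): if the scrub window were exactly one frame long, this is roughly how much of that ~250 ms snare you’d actually hear per click:

```cpp
// Rough arithmetic behind the point above; the 250 ms envelope length is the
// example from my DAW, and the frame rates are just common values.
#include <iostream>

int main()
{
    const double envelopeMs = 250.0; // example length of a tight snare hit
    const int rates[] = {24, 30, 60};
    for (int fps : rates) {
        const double frameMs = 1000.0 / fps;
        std::cout << fps << " fps: one frame = " << frameMs << " ms, or about "
                  << (frameMs / envelopeMs) * 100.0 << "% of the snare's envelope\n";
    }
    return 0;
}
```

At 60 fps that’s under 7% of the sound, which is why a strict one-frame window gets very hard to place against a beat.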

At any rate, we still need to find a sensible default value and I think it makes a lot of sense to make this user configurable.

Unfortunately I’ll have to hold off on direct feedback or testing as I’m not currently running Linux on any of my animation work computers. @Ralek

Thanks for the work, but I think making the test build Linux-only will reduce the number of potential testers by a lot. @LunarKreatures

can we get a test build for Windows too? @raghukamath

Yeah… Sorry about that.

I’d like to have been able to create a test build for Windows as well, but because we do all of our development on Linux (and basically everything else too these days), we still need to figure out some of the build setup for this branch on Windows. In fact, there are still some open questions about how we need to handle MLT (one of the dependencies) and our custom MLT plugin even on Linux.

Also, a couple of weeks ago we set up a little container image for Linux called Krita Devbox that makes it pretty easy for us to create a Linux AppImage binary, whereas packaging a Windows build is a bit more work for us right now. :stuck_out_tongue: (An excuse plus a plug!)

Once we have all of the build stuff settled across platforms and we can get this branch merged into the Krita master branch, then it’ll be much easier to have a bigger, more inclusive test.

@emilm

Gave the app image a spin, looks promising. :slightly_smiling_face:
Synchronization looks to be good.

I think 0.25 is a bit too much. It’s hard to tell which frame a beat starts at. (But like Ahab, I didn’t test with a lip-sync animation.)

Playback performance on high resolution files (4K UHD plus margins) is a bit worse on my machine.

Thanks for testing, emilm!

I think you’re probably right about 0.25 being too much; I’ll test out some other values and see if we can find a more comfortable default. We’ll take a look at the performance too; it might be a good idea to take some benchmarks of some kind.


Thanks again so far, everyone. Please feel free to come back and let us know anything else that comes to mind if you do more testing. This has been really useful to Eoin and me.

1 Like

I’ve never actually given storyboarding a try in Krita before, so that would be pretty interesting! I’m excited for the musical possibilities as well. I know I’ve been wanting to tackle Burnt Rice for a while.

Hey, I’m all for configuration options!

I agree, but the only reason I think it’s so important right now is because there isn’t an inbuilt waveform visualization. Knowing exactly which point you’re clicking on will be much easier when we have both the waveform and the audio feedback.

It’s no problem! This build was actually the final push needed for me to switch entirely to Linux myself. I unfortunately haven’t had a chance to test it out yet, but it should be easier in the future.

Thanks for the work and enjoy the rest of your vacation!

I have animated lip sync before, and yes, it is tiny, but it is what is expected for the given time.