A web based video player for files with multichannel audio where the user can adjust the volume levels of individual instruments and follow along a visual presentation (notation, conductor video etc.).
View the example:
Works in every browser. No obscure technology is involved.
You may be the songwriter of a band or the conductor of a choir. This player exists to share your own music pre-productions with your fellow musicians so that they can practice.
They should be able to control the volume levels (i.e. the mix) of individual instruments, or tracks. They must also be able to follow the music visually, in sync with the audio. The primary intended use case is notation; alternatively you can record yourself conducting, or show just a few handwritten instructions. For more ideas see the next chapter.
The multi-track approach also enables you to provide “optional” tracks to the musicians, such as a metronome click or spoken instructions (“Solo part in 4, 3, 2, 1 NOW!”), because they can easily be switched on and off.
A video with multichannel audio offers maximum flexibility and quality. You can produce whatever you want.
The web player itself is based on the standard Web Audio API, which has been available in every browser since at least 2015 and is considered robust, well-tested technology.
But complexity has to go somewhere: All work is done during the video production process, which is a bit more involved and time consuming than simply pressing the export button in a DAW.
So, what benefit does the video player method offer? Here are a few video ideas. Some of them follow the main idea of offering musicians a way to practice alongside a “playback” video; some are eye candy or extras for an interesting video, maybe a special video for your most dedicated fans, with “band commentary” etc.
Shortcut: Use the provided example website, copy and modify.
Each video needs its own sub-website. One .html page per project.
Copy the file videomixer.js to your own webserver. You only need it once, in a known location; no copy per project is required.
You need to load the JS file in your HTML. At the bottom(!) of your file, after the closing `</html>` tag, insert:
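A minimal include looks like this (the `src` path is an assumption; point it at wherever you copied videomixer.js):

```html
<script src="videomixer.js"></script>
```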
Order is important: videomixer.js can only be loaded after you have set up the following in the HTML body. Note: this work is done automatically by the supplied generator program (see the chapter below), but here is the manual way, which also explains what is going on:
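A sketch of that body setup, using the names explained below (`trackNames`, `volumeMap`, `videoAudioSampleRate`, source id `videofilename`). The concrete track labels, volume values, filename and the mixer container are illustrative assumptions — take the canonical version from the supplied example files:

```html
<!-- the player finds the video through this source id -->
<video controls>
  <source id="videofilename" src="mysong.mp4" type="video/mp4">
</video>

<!-- hypothetical container for the mixer strips; see the example files -->
<div id="mixer"></div>

<script>
  // one label per stereo pair, in channel order
  var trackNames = ["Voice", "Guitar", "Bass", "Metronome"];
  // starting volume per track, same order: 0.0 = off, 1.0 = original, max 2.0
  var volumeMap = [1.0, 1.0, 1.0, 0.0]; // metronome starts muted ("optional")
  // must match the sample rate the video's audio actually has
  var videoAudioSampleRate = 48000;
</script>
```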
Explanation of the code above:
Your video file's audio channels are sorted in stereo pairs, left and right channel (a design requirement of this tool). The array `trackNames` gives each stereo pair a label. These will be the names of the mixer strips; we call such a pair a “Track”. Each mixer strip (up to 9) automatically gets assigned a number key as keyboard shortcut to quickly toggle its volume between 0 and 1 (off and on). This also works when the video is in fullscreen mode.
`volumeMap` holds the default volume values for each of the tracks, in the same order as their labels in `trackNames`. These are the starting values for each mixer strip: 0.0 is off, 1.0 is the volume level of the original audio/video, and you can go up to 2.0 for software amplification (which may sound bad). This was primarily designed to mute certain tracks, like a metronome click or vocal conductor instructions, from the start and thereby make them “optional”.
You need to manually give your audio's sample rate in `videoAudioSampleRate`. Common values are 44100 Hz and 48000 Hz. This is not a free choice: you simply need to match the value your audio/video actually has. Since you created the video yourself, you most likely already know it. If not, use any tool such as VLC or (S)MPlayer, or a command line tool like ffprobe (part of ffmpeg), e.g. `ffprobe -v error -select_streams a:0 -show_entries stream=sample_rate -of default=noprint_wrappers=1:nokey=1 mysong.mp4`, to determine it.
The HTML5 player needs the source id `videofilename`; the actual filename and media type of course need to be filled in by you. You can place the video player wherever it fits your HTML design. The same applies to the element that determines where to display the mixer strips: you can move it around in your HTML to fit your design.
As the provided example files show, there are numerous buttons and keyboard shortcuts, such as “mute all”. You can optionally use them in your HTML. The keyboard shortcuts are hardcoded in videomixer.js and need to be changed there, if desired. This way they also work when the video is in fullscreen mode.
[S] setAllVolumeToZero() // all tracks off. Silence.
[R] resetAllVolumeToDefault() // reset all tracks to the values of the array "volumeMap"
[H] setAllVolumeToOne() // all tracks on. Hear all.
[Space] playPause() // toggle playing the video, even if the video player is not in input focus.
[All four arrow keys] seek(SECONDS) // seek SECONDS forward or backward (negative number), even if the player is not in focus.
[W] normalPlaybackSpeed() // normal playback speed.
[D] fasterPlaybackSpeed() // increase playback speed (may sound bad)
[A] slowerPlaybackSpeed() // decrease playback speed (may sound bad)
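The number-key toggling described above can be modeled as a small pure function. This is an illustrative sketch only — the function name and shape are assumptions; the real handlers are hardcoded in videomixer.js:

```javascript
// Illustrative sketch, not the actual videomixer.js implementation.
// Number keys "1".."9" toggle the matching mixer strip between 0 and 1.
function toggleTrack(volumes, key) {
  const index = parseInt(key, 10) - 1;       // key "1" addresses track 0
  if (Number.isNaN(index) || index < 0 || index >= volumes.length) {
    return volumes;                          // ignore keys without a strip
  }
  const next = volumes.slice();              // do not mutate the caller's array
  next[index] = next[index] === 0 ? 1 : 0;   // off <-> on
  return next;
}
```

For example, with volumes [1, 1, 0], pressing “3” switches the third track on.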
If seeking and faster playback do not work, suspect your webserver. In this case you will also be unable to skip around in the video's timeline with your mouse: your webserver does not support this kind of video streaming (most likely it is missing HTTP range request support).
The hardest part of all this is to actually create the video.
The (so far) best file format is mp4 with Opus audio (mp4 has problems with .flac audio; .mkv can do FLAC, but Firefox does not support .mkv; ffmpeg itself has problems with multichannel .aac and .wav).
Each instrument (or track) has two audio channels. The first two channels are track1, the next two track2 and so on. When producing your audio write down your track order because you need to write that manually into your HTML file, or the generator .ini.
To get started let’s concentrate on the audio part and produce a test video.
Create a multichannel Opus audio file. For example, Ardour can export multichannel .wav, and so can jack_capture. Convert that to Opus with `opusenc export.wav multi.opus`.
Write down how long your music is in seconds.
Now generate a video of the same length (or a second longer, it doesn’t need to be frame-accurate). The next command generates a video of 1:32 (92s) length with resolution of 1920x1080 and 30 frames per second.
ffmpeg -f lavfi -i testsrc=duration=92:size=1920x1080:rate=30 testsrc.mp4
Now merge your audio into the test video. This simply merges (`-c copy`) the two files; it does not render anything and should be done in a few moments. The overall duration of the video is the shorter of the two files (`-shortest`).
ffmpeg -i testsrc.mp4 -i multi.opus -map 0 -map 1 -c copy -shortest testcombined.mp4
The video is now ready to be used by the player, as described in this README.
This setup is quite involved. It assumes that you are familiar with the Linux command line and can analyze the given commands, so that you can adjust the file paths and names yourself.
Nothing prevents you from using a non-linear video editor that can export multichannel audio and doing everything by hand. However, I would like a more automated approach. I'll now describe my own process, which is hardly the best one, but it works for now. It could be made easier by scripting some steps, but it will still remain a bit of work.
The process is only so complicated and verbose because recording multichannel video via JACK is largely unexplored. Read the roadmap at the end of the file to see how that could be made easier in the future.
However, keep in mind that all this complexity has nothing to do with your musicians watching your video. You do the work so they don’t have to.
This is on Linux with “SimpleScreenRecorder” and “jack_capture”. You also need “sox” for one step, and an extra program has to be compiled from GitHub (see below). We assume that we want to record music with four tracks (8 channels).
jack_capture --channels 8 --jack-transport --manual-connections
jack_capture --channels 2 --jack-transport --manual-connections
Those two wait until jack transport starts rolling and will record until transport stops.
Set up, through the GUI, that you want to record mp4 video with the H.264 codec. For audio, choose JACK. Don't use the options “Record System Microphone” or “Record System Speakers”. For the audio recording codec (yes, we need that as well) choose Vorbis with a bitrate of 192. This internal audio recording will later be deleted, but we need it to sync with the multichannel audio.
The rest of the settings are up to you, for example whether to show the mouse cursor, what portion of the screen to record, in which resolution, etc. Configure a shortcut to start/stop the recording. I use Ctrl+Shift+R.
You should also use your desktop environment to setup a global keyboard shortcut to start and stop jack transport. Bind these two commands to keys of your choice:
bash -c 'echo play | jack_transport'
bash -c 'echo stop | jack_transport'
The order in which you connect to your input ports will determine the track order in the video player. First two input ports are your first instrument etc. Write down that order as tracks, for example “voice, guitar, bass, drums”.
Before we start the actual recording, a short explanation of what is going on here.
Both jack_capture instances will record different audio (multichannel vs. stereo mixdown), but they will be perfectly in sync, with frame-accurate precision, thanks to JACK transport.
SimpleScreenRecorder's internal audio recording will contain the same audio as the stereo jack_capture, but they will not(!) be in sync, because we start the video recording manually, at a different time than the audio recordings.
We will later compare the two stereo recordings (automatically) to figure out the offset between the video and the pure audio recordings, and then use that offset to integrate the multichannel recording into the video.
Instead of jack_capture you could use a DAW export, for example from Ardour. But then you need to make sure that the stereo export is also done by Ardour, so that the two are perfectly synced.
All three recording programs are now waiting to be started. And we can start them all with keyboard shortcuts.
Setup your recording area so it shows what you want to record (e.g. PDF reader or a live view of your webcam for conducting)
Now in this order:
Go to the SSR GUI and finalize the recording. Save the file. jack_capture will have produced two files: one with a .wav extension for the stereo version and one with a .wavex extension for the multichannel mix.
Your video file is now longer than the audio recording: it started earlier and stopped later (even if only by half a second).
Extract the video-internal audio:
ffmpeg -i ssr_capture.mp4 -map 0:a -c copy video_audio.ogg
(This assumes you previously chose Vorbis as SSR setting.)
Download this program: https://github.com/alopatindev/sync-audio-tracks and make it in place. We only need the script compute-sound-offset.sh.
Run (with your own file names and paths of course):
./compute-sound-offset.sh /home/user/jack_capture_stereo.wav /home/user/video_audio.ogg 0
The trailing 0 is a time limit for the tool's analyzer, where zero means no limit. I never found any reason to actually set it.
You will now receive a number, which is a time in seconds. If you used shortcuts to start your recordings, it may be under 1 second; in my case it was about 0.85 s. Copy this number.
We now know how much earlier the video started than the audio recording. We need to either trim the video's beginning or add silence to the multichannel audio. Trimming the video is easier. In the same step we delete the original video audio from SSR. Insert your offset here:
ffmpeg -ss 0.85212500000000002 -i ssr_capture.mp4 -c:v copy -an video_cut_no_audio.mp4
To check whether the calculated sync is correct, we can merge our jack_capture stereo recording with the video. Then you can watch the video in any standard player (VLC, SMPlayer) without having to set it up for multichannel playback.
First convert our stereo capture to Ogg Vorbis (for simplicity), then merge, then watch the video:
sox jack_capture_stereo.wav replacement_stereo_audio.ogg
ffmpeg -i video_cut_no_audio.mp4 -i replacement_stereo_audio.ogg -c copy -map 0:v:0 -map 1:a:0 video_cut_with_test_audio.mp4
smplayer video_cut_with_test_audio.mp4
Everything should look just fine; in fact, to the human eye and ear, it should be no different from the original SSR screen capture, except that the video content starts 0.85 seconds earlier.
The final step is a quick one.
Convert the multichannel jack_capture to Opus
opusenc jack_capture_multi.wavex multi.opus
and merge it with our video file that had its audio removed.
ffmpeg -i video_cut_no_audio.mp4 -i multi.opus -map 0 -map 1 -c copy -shortest /home/user/multichannelwebsite/mysong.mp4
This simply merges (`-c copy`) the two files; it does not render anything and should be done in a few moments. The overall duration of the video is the shorter of the two files (`-shortest`). Our video was a few moments longer because we stopped its recording last; this will trim the ending to the length of the audio.
The file mysong.mp4 is used by the website.
In this repository is a Python script called generate.py, which takes an .ini file as input argument and outputs an HTML website to standard output. You can save the output in any file you like. In the example we use index.html, so we can point our test webserver directly at the directory, but this is just for development reasons.
./generate.py example.ini > example/index.html
The file example.ini is self-explanatory, and you can copy and adapt it to your own files.
As written above, you really only need the file videomixer.js and certain sections in your HTML file. There is no important code in any .css file.
The files in the wip/wavyjs directory are wavyjs by Chris Schalick (MIT licensed). This is currently not in use; it is intended to provide .wav generation and download in the future. You can see some commented-out sections of code in videomixer.js to download your current mix. However, the Web Audio API side of that is not yet working.
We want the user to be able to download the current volume mix as .wav file so they can practice where they want without the video player website. This is work in progress.
Fixing some small bugs could further simplify the code, for example figuring out why the Web Audio API fails to correctly recognize the video's sample rate.
The video production process needs streamlining. My own workflow will forever be based on JACK, but it should be easier to record a video directly, without replacing the audio in post-processing. Apparently OBS Studio has a (complicated) way to offer multichannel JACK recordings, but even if that works properly, the OBS setup also has some complexity.
SimpleScreenRecorder, my screen capture tool of choice, only supports stereo audio. Making the channel count a user setting would solve everything, but SSR is so tied to stereo audio that it already has comments about this in its code.
Multiple Video Tracks. Record a video of each instrument, or choir-section. The users can switch between multiple “cameras” themselves.
Utilizing the subtitle and image embedding functionality of the mp4 video container format.
Multiple video tracks, subtitles and images are completely untested. They could already work, since everything is based on a standard HTML5 video player.