How good is the pitch accuracy if I convert an audio file containing a vocal melody into a MIDI file?
In other words, when the conversion doesn’t happen in real time.
Is the performance any better than real time conversions?
Is it any better than competitor products like Melodyne etc, or do you guys essentially use the same technology when it comes to pitch accuracy?
Hey @oyvind. The algorithms we have implemented are aimed at real-time use, so we don’t really have any way of processing offline per se.
If we did offline (non-real-time) conversion after the fact, then yes, we could achieve much better “performance” (in terms of what you might want as your MIDI output), because an offline method can go back and forth in time and correct notes using information that simply isn’t available to a real-time method.
Non-real-time methods can give you a smoother result compared to current real-time methods. The reason isn’t really increased “accuracy” - our pitch detector is very accurate. If anything, the issue is that it’s too accurate, when what people actually want is for it to be less accurate: for example, ignoring transients at the start of vocal notes that cause ghost notes, or ignoring vocal vibrato that causes unwanted note switching. A non-real-time method can go back over the notes and smooth them out - effectively making the output less accurate, but closer to what a producer usually wants: smoothed-out MIDI notes.
We are currently working on some machine learning systems to solve these problems. It’s uncharted territory for real-time, so it’s very much a new research project. It will take some time, but it’s on the cards and in development.
We do recommend using our note restriction features (e.g. setting a key or using the note collector) to help. This is effectively what software like Melodyne also does - it can go back, match the input to a key, and lock the MIDI notes to that key.
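The idea behind key restriction can be sketched in a few lines: snap each detected MIDI note to the nearest pitch allowed by the chosen scale. The scale table and function below are assumptions for demonstration, not the plugin’s actual note collector:

```python
# Illustrative key-restriction sketch: constrain detected MIDI notes to
# a scale by snapping each note to the nearest allowed pitch.

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}  # allowed pitch classes for C major

def snap_to_key(midi_note, scale=C_MAJOR):
    """Return the allowed pitch closest to midi_note (ties snap down)."""
    for offset in range(12):
        if (midi_note - offset) % 12 in scale:
            down = midi_note - offset
            break
    for offset in range(12):
        if (midi_note + offset) % 12 in scale:
            up = midi_note + offset
            break
    return down if midi_note - down <= up - midi_note else up
```

For example, a detected C#4 (MIDI 61) would snap to C4 (MIDI 60) in C major, which is why key restriction suppresses many of the vibrato-induced ghost notes described above.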
Hope that helps!
Thank you for your detailed reply. It covered everything I wondered about.