My Vocals + Ai

onlyinitforthemoney · October 16

I have an old vocal track

Is it possible to feed it to an AI and have it spit the vocal back with different lyrics?

I've done some research, but none of the tools I've seen offer that feature.

Which one can do that?

Thank you and take care

treesha · October 16

Bob Doyle has some videos on YouTube about making a model of your own voice and then using it in another way. It's a few steps and uses a few programs, if I recall. I think Replay is one. I will add links later on. I messed with it a while back and want to try it again for times when allergies hit my voice bad. So it's not simple, but it is possible to get your voice with new lyrics.

mettelus · October 16

This is probably coming soon, as most Voice AI is based on actual/adjusted sound models. The closest thing I know of that exists now is the AI Vocal Modeling/Replacement available in DaVinci Resolve Studio 20, but that is designed for dialogue only (at the moment), and there are a few important points with it:

The model audio needs to be as dry as possible. Noise and FX (particularly reverb) should be removed to get the best result.
The model needs to enunciate properly. For dialogue this is not as often an issue, but some singers have a real challenge with this. Similarly, the audio to be replaced needs to be enunciated as well to replace properly.
The AI Modeler is slow, taking roughly 40 minutes to model 2 minutes of vocal. Because it is modeling a vocal, the pitch/key of that model is embedded into the model itself, so this needs to be taken into account for the replacement. Although there "are" tools in Resolve for this, Melodyne is a better choice for its precision capabilities.
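To put that speed in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes modeling time scales linearly with the length of the audio, which is only a guess based on the single 40-minutes-for-2-minutes figure above; the function name is made up for illustration.

```python
# Ratio taken from the figure quoted above: ~40 minutes of processing
# for ~2 minutes of vocal. Linear scaling is an ASSUMPTION, not a measurement.
MODEL_MINUTES_PER_AUDIO_MINUTE = 40 / 2


def estimated_modeling_minutes(audio_minutes):
    """Rough processing-time estimate for a vocal of the given length."""
    return audio_minutes * MODEL_MINUTES_PER_AUDIO_MINUTE


if __name__ == "__main__":
    for minutes in (2, 5, 10):
        est = estimated_modeling_minutes(minutes)
        print(f"{minutes} min of vocal: about {est:.0f} min of modeling")
```

Actual times will depend on the machine and the material, so treat the result as a ballpark only.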

I posted a little more detail in this post testing it with singing, but the bottom line is that it is targeted at dialogue replacement (not generation from lyrics), and in order for it to accurately map "everything," the model must be exposed to everything (meaning all phonemes, diphthongs, consonants, vowels, etc., VERY similar to how Realivox Blue was made). Right now, there is no way to "add to" an AI voice model (which would be super helpful), so if your sample only contains a subset of phonemes, getting it to replace ones it has never sampled may miss the mark.

I am assuming the OP is intending to use a "younger voice" (rather than current) to generate more/different lyrics here. Now that there are AI modeler(s) available, the ability to parse sung vocals is not that far away, BUT... there are very significant and serious copyright implications to such, since the modeler has no clue who the subject is... anyone who has had their singing voice recorded (even if long since dead) could be modeled. That aspect alone can be a big stumbling block for release, since there is no way to differentiate someone who wants to model their own voice from someone who wants to model someone else's without permission. A way to ensure this for someone capturing their voice "right now" would be to have them interactively train the AI modeler, very similar to the training that Dragon NaturallySpeaking does for its voice-to-text functionality.

treesha · October 16

Since I was kinda vague about the process and the videos I saw, I asked AI to spell it out. Here is what it said, plus links to some videos.

🛠️ Step-by-Step Workflow

1. Record Clean Voice Samples (or, in my case, use vocals that aren't allergy-affected)

Gather 3–10 minutes of your best singing or speaking voice

Use isolated vocals—no background music—for best results.
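A quick way to check whether you have enough material before training is to add up the length of your sample files. This is just a minimal sketch using Python's standard `wave` module; it assumes the samples are plain PCM WAV files, and the function names and the 3-minute default (taken from the range above) are only for illustration.

```python
import wave


def wav_duration_seconds(path):
    """Length of one PCM WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def total_minutes(paths):
    """Combined length of all sample files, in minutes."""
    return sum(wav_duration_seconds(p) for p in paths) / 60.0


def enough_material(paths, min_minutes=3.0):
    """True if the samples add up to at least min_minutes of audio."""
    return total_minutes(paths) >= min_minutes
```

For example, `enough_material(["verse.wav", "chorus.wav"])` (hypothetical file names) tells you whether those two files together reach the 3-minute mark.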

2. Install Replay and Pinokio

Replay is the free app that trains and converts your voice model.

Use Pinokio Launcher to install and run Replay easily.

Download Replay: tryreplay.io

Get Pinokio: pinokio.computer

3. Train Your Voice Model in Replay

Open Replay and go to the Training tab (GPU required).

Load your clean voice samples and follow the prompts to train your model.

https://www.youtube.com/watch?v=PYQnzIwa4mA

4. Convert Your Allergy-Affected Track

Import your allergy-affected vocal track into Replay.

Use your trained model to convert it into your clean voice.

5. Export and Replace in Cakewalk

Export the converted vocal as a WAV file.

Open Cakewalk and replace the original vocal track with the new one.

Align timing and apply any effects as needed.

"100% Free Voice Cloning and Conversion with the Updated Replay": covers Replay setup, training your voice model, and converting vocals.

https://www.youtube.com/watch?v=PFJQSzoaDxI

"Voice Cloning with RVC - Step by Step - Easiest method": deep dive into the technical process of training and converting vocals.

https://www.youtube.com/watch?v=PYQnzIwa4mA

"RVC Tutorial 2025: The Easiest Vocal A.I. *UPDATE": updated guide for using RVC with Replay for best results (not Bob Doyle).

https://www.youtube.com/watch?v=_V15jq9zSJw

onlyinitforthemoney · October 16

Wow, I got more than I bargained for.

As I suspected, AI is not yet ready for what I need, and you're right, there are "(...) very significant and serious copyright implications to such,"

but there are alternative solutions in your replies.

5 hours ago, mettelus said:

I am assuming the OP is intending to use a "younger voice" (rather than current)

Yes it is

I'll look into your suggestion

Thank you both

mettelus · October 16

I was checking through what Treesha posted above, and it seems the AI model cannot be appended to there either (the scripting looks nearly identical to what Resolve has, so it could be the same code). A way around this (I would recommend what she posted above to build the AI model): if you have multiple recordings of yourself, dovetail those vocal tracks end-to-end so you have a single (and possibly HUGE) WAV file, then send that into the modeler (and walk away or take a nap while it processes). The more phonemes you feed it, the better the model will be, and if it won't let you append to a model, feed it everything you have in one go.
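If you would rather not do the end-to-end joining by hand in a DAW, something like the following Python sketch could do it. It only uses the standard `wave` module and assumes every recording is a PCM WAV with the same channel count, bit depth, and sample rate (it refuses to run otherwise instead of resampling). The function names are made up for illustration and have nothing to do with Replay or Resolve.

```python
import wave


def wav_format(path):
    """Return (channels, sample width in bytes, sample rate) for a PCM WAV."""
    with wave.open(path, "rb") as src:
        return (src.getnchannels(), src.getsampwidth(), src.getframerate())


def concat_wavs(input_paths, output_path):
    """Join WAV files end-to-end into one file, in the order given."""
    if not input_paths:
        raise ValueError("no input files given")

    # Check all formats first so nothing is written if the files differ.
    formats = [wav_format(p) for p in input_paths]
    if any(f != formats[0] for f in formats):
        raise ValueError("all files must share channels, bit depth, and sample rate")

    channels, width, rate = formats[0]
    with wave.open(output_path, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(width)
        out.setframerate(rate)
        for path in input_paths:
            with wave.open(path, "rb") as src:
                # Copy in chunks so a HUGE file does not need to fit in memory.
                while True:
                    chunk = src.readframes(65536)
                    if not chunk:
                        break
                    out.writeframes(chunk)
```

A call such as `concat_wavs(["take1.wav", "take2.wav"], "all_vocals.wav")` (hypothetical file names) would produce the single file to send into the modeler. Recordings at different sample rates would need converting to a common format first.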

Bear in mind, this is also a replacement tool, so you can sing a track with your current voice and then apply the AI model to it. Again, Melodyne can assist greatly for both the current voice track (in my case, hitting high notes isn't what it used to be) before replacement, as well as post-production after replacement (in tests I ran the key was embedded into the AI model, so that may need polishing after replacement).
