Those who can’t, teach machine to do it

People seem skeptical of the data processing I have done.

I’m fine with that.

Honestly, just by looking at the integer vector, I can’t tell whether it is supposed to be a sound or whether someone has played a prank on me by replacing the meaningful waveform vectors with random ones. Nor can I tell whether the data is raw or in another representation such as wavelets or MFCCs. It is interesting, then, that we expect our machine learning algorithm to figure this out.

So I’ve written a Python script to check that the vectors make sense. It picks a random sentence from the training data, shows its waveform along with the corresponding phonemes and words, and outputs a .WAV file. It also outputs the speaker’s features so I can check that the voice fits.

It’s supposed to say “Diane may splurge and buy a turquoise necklace”.

It does.
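
For readers who want to run the same kind of listening test, here is a minimal sketch of the WAV-writing step. It is not the actual script: the function name `write_wav` is made up for illustration, and the 16 kHz sample rate and 16-bit mono format are assumptions to adjust to whatever the dataset really uses.

```python
import array
import sys
import wave


def write_wav(samples, path, sample_rate=16000):
    """Write a sequence of integer samples to a 16-bit mono WAV file.

    ASSUMPTIONS: samples fit in signed 16 bits, and the recording is
    mono at `sample_rate` Hz (16000 is a guess, not taken from the data).
    """
    pcm = array.array("h", (int(s) for s in samples))
    # WAV stores samples little-endian, so swap on big-endian machines.
    if sys.byteorder == "big":
        pcm.byteswap()
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
```

If the file plays back as recognisable speech, the vector is a waveform. Random vectors would come out as noise.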

Also, reading the script might help you understand how to use the .npy and .pkl files.
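
In case it helps before opening the script, loading the two file types can be sketched roughly as follows. The helper names `load_npy` and `load_pkl` are hypothetical, and the sketch assumes the .npy files are ordinary NumPy arrays and the .pkl files are standard Python pickles.

```python
import pickle

import numpy as np


def load_npy(path):
    """Load a NumPy array saved with np.save.

    If the array holds Python objects (for example variable-length
    waveforms), np.load needs allow_pickle=True; only do that for
    files you trust.
    """
    return np.load(path)


def load_pkl(path):
    """Load a pickled Python object (for example a list of names)."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical usage; the .npy file name is a placeholder:
# waveforms = load_npy("some_waveforms.npy")
# feature_names = load_pkl("spkr_feature_names.pkl")
```

Pickles written under Python 2 may need `pickle.load(f, encoding="latin1")` when read from Python 3.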

P.S.: On an unrelated note, while I would not expect words to add much information beyond the phonemes, I do consider the final punctuation obviously important for learning prosody (assertion, question, exclamation…). So important that I haven’t included this feature yet…

P.P.S.: If I wanted to transform the data into more mainstream representations such as the Fourier transform or wavelets, I would try SciPy’s signal processing and discrete Fourier transform packages.
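
As a rough illustration of the Fourier route, the sketch below computes the magnitude spectrum of a single frame. It uses `numpy.fft` rather than the SciPy packages, simply to keep dependencies small. The function name, the Hann window and the 16 kHz sample rate are assumptions, and the input is a synthetic 440 Hz tone rather than real speech.

```python
import numpy as np


def magnitude_spectrum(samples, sample_rate=16000):
    """Return (frequencies in Hz, magnitudes) for one frame of audio.

    ASSUMPTIONS: real-valued mono samples; sample_rate is a guess.
    """
    x = np.asarray(samples, dtype=np.float64)
    # A Hann window reduces spectral leakage at the frame edges.
    window = np.hanning(len(x))
    spectrum = np.abs(np.fft.rfft(x * window))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    return freqs, spectrum


# Synthetic check: a pure 440 Hz tone, 0.1 s long at 16 kHz.
t = np.arange(1600) / 16000.0
tone = np.sin(2 * np.pi * 440 * t)
freqs, spectrum = magnitude_spectrum(tone)
peak_hz = freqs[np.argmax(spectrum)]  # frequency of the strongest bin
```

Applying this frame by frame over a sentence would give a spectrogram-like representation.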

3 thoughts on “Those who can’t, teach machine to do it”

  1. Nice post, thanks for doing that! It is definitely easier to load the samples/features from npy files than from lots of different files. They sound fine!

    I wrote a blog post about different speech signal representations. It includes an IPython notebook with examples to convert speech to LPC, STFT, and wavelet-based representations: http://www.seaandsailor.com/initial_representation.html

    By the way: there are two files on the server (speakers_ids.pkl and spkr_feature_names.pkl) which are readable only by your user. Could you change those permissions? Thanks!
