What is the technical term for converting a sound recording to a phoneme vector?
Many natural language processing models begin by converting text to a vector in which each element is a number representing some semantic entity (I would say each number is a word, but a token is not necessarily a word). This is sometimes called tokenization.
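To make the text case concrete, here is a minimal sketch of what I mean by tokenization. It assumes a naive whitespace split, and the helper names `build_vocab` and `encode` are made up for illustration; real tokenizers typically use subword units instead.

```python
def build_vocab(tokens):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab):
    """Map text to a list of integer token IDs (naive whitespace tokens)."""
    return [vocab[tok] for tok in text.lower().split()]


text = "the cat sat on the mat"
vocab = build_vocab(text.lower().split())
ids = encode(text, vocab)
print(ids)  # the repeated word "the" maps to the same ID both times
```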
Suppose you take a raw sound recording of someone speaking a known language (say you know a priori that the person is speaking correct English) and you want to extract a vector where each element is a number representing a phonemic (or perhaps phonetic is a better term) token. What would this process be called?
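As a rough illustration of the output I have in mind, the sketch below maps a phoneme sequence to integer IDs. The audio-to-phoneme step is exactly the part I cannot name, so it is not implemented here: the `recognized` list is an assumed result for "hello world", and the small ARPAbet-style inventory is hypothetical.

```python
# Hypothetical phoneme inventory (a small ARPAbet-style subset, for illustration only).
PHONEMES = ["HH", "AH", "L", "OW", "W", "ER", "D"]
PHONEME_TO_ID = {p: i for i, p in enumerate(PHONEMES)}


def phonemes_to_vector(phonemes):
    """Map a sequence of phoneme symbols to their integer IDs."""
    return [PHONEME_TO_ID[p] for p in phonemes]


# Assumed output of the unnamed audio-to-phoneme step for "hello world".
recognized = ["HH", "AH", "L", "OW", "W", "ER", "L", "D"]
vector = phonemes_to_vector(recognized)
print(vector)
```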
I know that there are many voice recognition models, but these usually go all the way from sound to text, losing most non-semantic properties (accent, inflection) in the process. I would like to know the name for the part of the process that comes before those non-semantic properties are filtered out.