Community Proposals

Many natural language processing models begin by converting text into a vector in which each element is a number representing some semantic unit. (I would say each number is a word, but a token is not necessarily a word.) This is sometimes called tokenization.
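To make the text case concrete, here is a minimal sketch of the kind of mapping I mean, assuming a toy whitespace tokenizer. The names `build_vocab`, `tokenize`, and `unk_id` are made up for illustration; real models typically use subword tokenizers rather than splitting on spaces.

```python
def build_vocab(corpus):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for sentence in corpus:
        for token in sentence.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def tokenize(text, vocab, unk_id=-1):
    """Map text to a list of integer IDs; unknown tokens get unk_id (an assumed convention)."""
    return [vocab.get(token, unk_id) for token in text.lower().split()]


vocab = build_vocab(["the cat sat", "the dog sat down"])
ids = tokenize("the dog sat", vocab)
print(ids)  # [0, 3, 2]
```

The audio analogue I am asking about would produce a similar sequence of integers, but with each ID standing for a phoneme-like unit rather than a piece of text.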

Suppose you have a raw sound recording of someone speaking a known language (say you know a priori that the person is speaking correct English), and you want to extract a vector in which each element is a number representing a phonemic (or perhaps phonetic is the better term) token. What would this process be called?

I know that there are many voice recognition models, but these usually go all the way from sound to text, losing most non-semantic properties (accent, inflection) in the process. I would like to know the name of the process up to the point where the non-semantic properties are filtered out.

Comments on What is the technical term for converting a sound recording to a phoneme vector?
