Published: 02.07.2022
Natural Language Processing (NLP) refers to everything from speech recognition to language generation, each requiring different techniques. Some of the most important techniques are explained below: part-of-speech tagging, named entity recognition, and parsing.
Let's examine the sentence "John hit the can." One of the first steps in NLP is lexical analysis, using a technique called Part-of-Speech (POS) tagging. With this technique each word is labeled with a category of words that share similar grammatical properties, based on its relationship to adjacent and related words. Not only individual words are tagged, but also larger units such as phrases and sentences. Part-of-speech tagging is mostly done with statistical models, which produce probabilistic results rather than hard rules, and can therefore handle previously unseen text. They can also weigh multiple possible answers instead of committing to a single one.
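To see why context matters, here is a deliberately naive baseline: tag each word with its single most frequent tag from a small lexicon. The tag counts below are made-up numbers for illustration, not real corpus statistics. Because this ignores context entirely, it mislabels the ambiguous word 'can':

```python
# Naive baseline tagger: pick each word's most frequent tag from a toy
# lexicon (the counts are invented for this sketch). N = noun, V = verb,
# D = determiner.
tag_counts = {
    "John": {"N": 10},
    "hit":  {"V": 8, "N": 2},
    "the":  {"D": 100},
    "can":  {"V": 7, "N": 5},   # ambiguous: verb ("I can go") vs. noun ("the can")
}

def tag(words):
    # For each word, choose the tag with the highest count, ignoring context.
    return [max(tag_counts[w], key=tag_counts[w].get) for w in words]

print(tag("John hit the can".split()))  # ['N', 'V', 'D', 'V'] – 'can' is wrongly a verb
```

Fixing this mistake requires looking at the preceding words, which is exactly what the statistical sequence models described next do.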
One model that is often used for tagging is the hidden Markov model (HMM). An HMM is similar to a Markov decision process, where each hidden state is a part of speech and the observed output of the process is the words of the sentence. HMMs 'remember' the sequence of tags that came before, and based on this they can make better estimates of what part of speech a word is.
For example: 'can' in 'the can' is more likely to be a noun than a verb. The end result is that the words are labeled as follows: 'John' as a noun (N), 'hit' as a verb (V), 'the' as a determiner (D) and 'can' as a noun (N) too.
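The HMM idea can be sketched with the Viterbi algorithm, which finds the most probable tag sequence. All probabilities below are hand-picked toy values (an assumption for this sketch, not learned from data); the key point is that a determiner strongly predicts a following noun, so 'can' after 'the' comes out as a noun:

```python
# Toy HMM tagger using the Viterbi algorithm. States are POS tags;
# transition and emission probabilities are invented toy values.
states = ["N", "V", "D"]
start = {"N": 0.5, "V": 0.1, "D": 0.4}          # P(first tag)
trans = {                                        # P(next tag | current tag)
    "N": {"N": 0.1, "V": 0.6, "D": 0.3},
    "V": {"N": 0.3, "V": 0.1, "D": 0.6},
    "D": {"N": 0.9, "V": 0.1, "D": 0.0},         # determiners almost always precede nouns
}
emit = {                                         # P(word | tag)
    "N": {"John": 0.5, "can": 0.5},
    "V": {"hit": 0.6, "can": 0.4},               # 'can' can be emitted by N or V
    "D": {"the": 1.0},
}

def viterbi(words):
    # V[t][s] = probability of the best tag sequence ending in tag s at position t.
    V = [{s: start[s] * emit[s].get(words[0], 0.0) for s in states}]
    back = [{}]
    for t in range(1, len(words)):
        V.append({}); back.append({})
        for s in states:
            prev = max(states, key=lambda p: V[t - 1][p] * trans[p].get(s, 0.0))
            V[t][s] = V[t - 1][prev] * trans[prev].get(s, 0.0) * emit[s].get(words[t], 0.0)
            back[t][s] = prev
    # Backtrack from the best final tag.
    tags = [max(states, key=lambda s: V[-1][s])]
    for t in range(len(words) - 1, 0, -1):
        tags.append(back[t][tags[-1]])
    return tags[::-1]

print(viterbi("John hit the can".split()))  # ['N', 'V', 'D', 'N']
```

Unlike the frequency-only baseline, the model's memory of the previous tag ('D') overrides the fact that 'can' is more often a verb overall.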
Named Entity Recognition, or NER, is similar to POS tagging. Instead of tagging words with their grammatical function, words are tagged with the type of entity that the word represents. These entities can be, for example, people, companies, times or locations, but also more specialized entities such as genes or proteins. Although an HMM can also be used for NER, the technique of choice is a recurrent neural network (RNN). An RNN is a type of neural network like those discussed above, but it takes sequences as input (a number of words in a sentence, or whole sentences) and remembers the output of the previous step (8). In the sentence we're looking at, it would recognize 'John' as the entity 'person'.
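The "remembers the previous step" property of an RNN can be sketched in a few lines: the cell processes the sentence one word at a time, and each new hidden state depends on both the current word and the previous hidden state. The weights here are random toy values (a real NER tagger would learn them from labelled data, and would add an output layer mapping each hidden state to an entity label):

```python
# Minimal RNN cell walking over a sentence. Each hidden state depends on
# the current word AND the previous hidden state, so information about
# earlier words is carried forward. Weights are random toy values.
import math
import random

random.seed(0)
HIDDEN = 4  # size of the hidden state (an arbitrary choice for this sketch)

words = "John hit the can".split()
vocab = {w: i for i, w in enumerate(words)}

def embed(word):
    # Toy one-hot embedding: a vector with a 1 at the word's vocabulary index.
    v = [0.0] * len(vocab)
    v[vocab[word]] = 1.0
    return v

# Random input->hidden and hidden->hidden weight matrices.
W_in = [[random.uniform(-1, 1) for _ in range(len(vocab))] for _ in range(HIDDEN)]
W_h = [[random.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(HIDDEN)]

def step(x, h):
    # One recurrent step: h_new = tanh(W_in @ x + W_h @ h).
    return [math.tanh(sum(W_in[i][j] * x[j] for j in range(len(x))) +
                      sum(W_h[i][j] * h[j] for j in range(HIDDEN)))
            for i in range(HIDDEN)]

h = [0.0] * HIDDEN  # initial hidden state: all zeros
states = []
for w in words:
    h = step(embed(w), h)
    states.append(h)

print(len(states))  # one hidden state per word in the sequence
```

Because the hidden state at 'can' has been influenced by 'John', 'hit' and 'the', a trained output layer can use that context when assigning entity labels.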
A final technique to discuss is parsing (syntactic parsing): analyzing the grammar of the text and the way words are arranged, so that the relationships between words become clear. The part-of-speech tags from lexical analysis are grouped into short phrases, which in turn can be combined with other phrases or words into slightly longer ones. This is repeated until the goal is reached: every word of the sentence has been used. The rules for how words can be grouped are called a grammar and can take a form like this: D + N → NP, which reads: a determiner plus a noun forms a noun phrase. The final result is shown in the figure.
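This bottom-up grouping can be sketched with a small CYK-style chart parser. The grammar below is a toy assumption covering just our example sentence: D + N → NP, V + NP → VP, NP + VP → S, plus a rule letting a bare noun act as a noun phrase:

```python
# Tiny CYK-style chart parser. Starting from POS tags, adjacent spans are
# combined into longer phrases until the whole sentence is covered by S.
lexicon = {
    "John": {"N"}, "hit": {"V"}, "the": {"D"}, "can": {"N", "V"},
}
binary = {                      # rules combining two adjacent spans
    ("D", "N"): {"NP"},         # D + N  -> NP   (noun phrase)
    ("V", "NP"): {"VP"},        # V + NP -> VP   (verb phrase)
    ("NP", "VP"): {"S"},        # NP + VP -> S   (sentence)
}
unary = {"N": {"NP"}}           # a bare noun can stand alone as a noun phrase

def unary_closure(cats):
    # Repeatedly apply unary rules until no new categories appear.
    out = set(cats)
    frontier = list(out)
    while frontier:
        c = frontier.pop()
        for parent in unary.get(c, ()):
            if parent not in out:
                out.add(parent)
                frontier.append(parent)
    return out

def cyk(words):
    n = len(words)
    chart = {}
    # Length-1 spans: look up each word's possible tags.
    for i, w in enumerate(words):
        chart[(i, i + 1)] = unary_closure(lexicon.get(w, set()))
    # Longer spans: try every split point and every binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cats = set()
            for k in range(i + 1, j):
                for a in chart[(i, k)]:
                    for b in chart[(k, j)]:
                        cats |= binary.get((a, b), set())
            chart[(i, j)] = unary_closure(cats)
    return chart

chart = cyk("John hit the can".split())
print("S" in chart[(0, 4)])  # True: the whole sentence parses as S
```

The chart mirrors the repeated grouping described above: 'the' + 'can' first forms an NP, 'hit' + that NP forms a VP, and 'John' + the VP finally forms the sentence S.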