History of NLP

As we mentioned previously, NLP is a subfield of AI that allows computers to understand, interpret, and translate human languages, as well as generate text. It aims to facilitate better communication between computers and humans by processing unstructured textual data.

Drawing on disciplines including computer science and computational linguistics, NLP narrows the communication gap between humans and computers by allowing computers to read text, hear speech, and comprehend a message’s meaning.

NLP is a broad field divided into two main areas: natural language understanding and natural language generation. Although both fall under the umbrella of NLP, they have some key differences:

Natural language understanding involves analyzing the semantics and syntax of unstructured text.
Natural language generation is the process of using structured data to produce unstructured text.

This chapter explores the birth of NLP and spotlights some of the main individuals who spearheaded its evolution. It also presents some real-world applications that use natural language generation to illustrate NLP’s broad applicability.

The Evolution of NLP

After World War II, there was considerable demand for machines that could translate text from one language to another. The early techniques were rudimentary: dictionary lookups combined with hard-coded rules for ordering words.
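
To make the idea of dictionary lookups plus hard-coded ordering rules concrete, here is a minimal sketch in Python. It is not a reconstruction of any historical system: the English-to-Spanish vocabulary, the word lists, and the single adjective-noun rule are all assumptions chosen for illustration.

```python
# Toy word-for-word translator: a dictionary lookup plus one hard-coded
# reordering rule. The vocabulary and the rule are illustrative assumptions.

LEXICON = {
    "the": "el",
    "cat": "gato",
    "black": "negro",
    "sleeps": "duerme",
}

# Hand-written word classes used by the ordering rule (hypothetical).
ADJECTIVES = {"black"}
NOUNS = {"cat"}


def reorder(words):
    """Swap each adjective-noun pair so the adjective follows the noun."""
    result = list(words)
    i = 0
    while i < len(result) - 1:
        if result[i] in ADJECTIVES and result[i + 1] in NOUNS:
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return result


def translate(sentence):
    """Reorder the source words, then replace each one by dictionary lookup."""
    words = sentence.lower().split()
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(LEXICON.get(word, word) for word in reorder(words))


print(translate("The black cat sleeps"))  # el gato negro duerme
```

Even in this small sketch the weaknesses of the approach show: every word and every ordering rule has to be written by hand, and any word outside the lexicon is left untranslated.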

In the late 1950s, researchers identified some problems with developing NLP. One of these researchers was Noam Chomsky, who observed that models used in NLP should be able to recognize grammatically correct sentences the same way humans do. He published his book Syntactic Structures in 1957, revolutionizing how linguistic concepts are represented in computers. He claimed that grammar is generative and introduced a mathematical model for representing language in computers.
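
The notion of a generative grammar can be sketched with a handful of rewrite rules. The example below is only a rough illustration under stated assumptions: the grammar, its symbols, and the helper names generate and is_grammatical are hypothetical, and the sketch covers simple phrase-structure rules rather than the full model described in Syntactic Structures.

```python
import itertools

# Hypothetical toy grammar: each symbol maps to its possible expansions.
# Symbols that do not appear as keys are treated as words.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"]],
    "N": [["dog"], ["cat"]],
    "V": [["sees"], ["sleeps"]],
}


def generate(symbol):
    """Yield every word sequence that can be derived from the given symbol."""
    if symbol not in GRAMMAR:
        yield [symbol]  # a word: nothing left to expand
        return
    for production in GRAMMAR[symbol]:
        # Expand each part of the rule, then combine all the alternatives.
        expansions = [list(generate(part)) for part in production]
        for combo in itertools.product(*expansions):
            yield [word for piece in combo for word in piece]


def is_grammatical(sentence):
    """A sentence counts as grammatical if the grammar can generate it."""
    return sentence.split() in list(generate("S"))


for words in generate("S"):
    print(" ".join(words))
```

Because this toy grammar has no recursive rules, it generates a finite set of sentences that can be listed exhaustively. A grammar with recursive rules would describe infinitely many sentences, and checking grammaticality would then call for a parser rather than enumeration.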

Between 1957 and the 1970s, conflicting views emerged among NLP researchers, splitting the field into two camps: linguistic-based and statistical-based. The linguistic-based camp relied on linguistic rules and rejected statistical methods. The statistical-based camp, on the other hand, believed NLP should rely on statistics alone. Linguistic-based NLP progressed significantly in this era, but its results still fell short.

Until the 1980s, computer systems with NLP functionality relied on complex “hard-coded” rules. Computer scientists realized that isolated solutions to natural language problems were not effective. This challenge led them to draw on other fields of science, including linguistics, and as a result more research was conducted to bridge the gap between computer science and linguistics.

As computational power increased, the late 1980s witnessed a shift to machine learning algorithms. Decision trees were among the earliest algorithms adopted; they use probability and statistics to learn decision rules from language data.

From the 2000s onward, researchers began using deep learning algorithms to solve problems in NLP. With the vast increase in available textual data, faster processors, and more memory, it eventually became practical to train large language models (LLMs), which use neural networks to achieve better results.

For example, researchers have used convolutional neural networks to classify sentences, perform sentiment analysis, and summarize text.

Next, long short-term memory (LSTM) networks, a type of recurrent neural network (RNN), came into wide use. During this period, the internet greatly increased the availability of electronic text. Recurrent neural networks enabled sequence-to-sequence models well suited to sequential data such as text, speech, and audio. While LSTMs were effective at capturing sequential dependencies, they remained limited in handling longer sequences.

In modern NLP, transformers have driven further progress on problems such as question answering and machine translation. The transformer is a newer neural network architecture that handles sequence-to-sequence tasks, transforming input sequences into output sequences efficiently in deep learning applications.

With so many advancements, the application of neural networks has grown immensely, and now there are large pre-trained models available that you can use to create NLP-powered applications.

Conclusion

You have finished the first chapter of this course and learned the definition of NLP and its two sub-fields: natural language understanding and natural language generation. You also learned about the evolution of NLP, the people who pioneered it, and how it progressed from a linguistic-based approach to a statistical-based one.

Over the following sections of this course, you will learn the basics of NLP, including pre-processing, classical machine learning methods, and ways to train and evaluate models. But first, let’s start with some of the most common and powerful applications of NLP!