Learn NLP in Python — text preprocessing, machine learning, transformers & LLMs using scikit-learn, spaCy & Hugging Face
What you’ll learn
- Review the history and evolution of NLP techniques and applications, from traditional machine learning models to modern LLM approaches
- Walk through the NLP text preprocessing pipeline, including cleaning, normalization, linguistic analysis, and vectorization
- Use traditional machine learning techniques to perform sentiment analysis, text classification, and topic modeling
- Understand the theory behind neural networks and deep learning, the building blocks of modern NLP techniques
- Break down the main parts of the Transformer architecture, including embeddings, attention, and feedforward neural networks (FFNs)
- Use pretrained LLMs with Hugging Face to perform sentiment analysis, NER, zero-shot classification, document similarity, and text summarization & generation
Requirements
- We strongly recommend taking our Data Prep & EDA with Python course first
- Jupyter Notebooks (free download, we’ll walk through the install)
- Familiarity with base Python and pandas is recommended, but not required
Description
This is a practical, hands-on course designed to give you a comprehensive overview of all the essential concepts for modern Natural Language Processing (NLP) in Python.
We’ll start by reviewing the history and evolution of NLP over the past 70 years, including the most popular architecture at the moment, Transformers. We’ll also walk through the initial text preprocessing steps required for modeling, where you’ll learn how to clean and normalize data with pandas and spaCy, then vectorize that data into a Document-Term Matrix using both word counts and TF-IDF scores.
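To make the vectorization step concrete, here is a minimal sketch that builds a Document-Term Matrix by hand using only the Python standard library. The three sample documents are made up for illustration, and the TF-IDF formula shown (raw count times log(N / document frequency)) is just one textbook variant. Library vectorizers such as scikit-learn's typically apply smoothing and normalization, so their scores will differ.

```python
import math
from collections import Counter

# Hypothetical sample documents (not from the course materials)
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cats and the dogs",
]

# Tokenize: lowercase and split on whitespace. A fuller pipeline would also
# strip punctuation, remove stop words, and lemmatize (for example with spaCy).
tokenized = [doc.lower().split() for doc in docs]

# Vocabulary: the sorted unique terms become the matrix columns
vocab = sorted({term for tokens in tokenized for term in tokens})

# Count-based Document-Term Matrix: one row per document, one column per term
counts = [Counter(tokens) for tokens in tokenized]
count_dtm = [[c[term] for term in vocab] for c in counts]

# Document frequency: how many documents contain each term
n_docs = len(docs)
doc_freq = {term: sum(1 for c in counts if term in c) for term in vocab}

# TF-IDF variant assumed here: raw count * log(N / document frequency)
tfidf_dtm = [
    [c[term] * math.log(n_docs / doc_freq[term]) for term in vocab]
    for c in counts
]

print(vocab)
for row in count_dtm:
    print(row)
```

Under this variant, a word like "the" that appears in every document gets a TF-IDF score of zero, while a rarer word like "cat" scores higher than a more common one like "on".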
After that, the course is split into two parts:
- The first half covers traditional machine learning techniques
- The second half covers modern deep learning and LLM (large language model) approaches
For the traditional NLP applications, we’ll begin with Sentiment Analysis to determine the positivity or negativity of text using the VADER library. Then we’ll cover Text Classification on labeled data with Naïve Bayes and Topic Modeling on unlabeled data with Non-Negative Matrix Factorization, both using the scikit-learn library.
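As a rough illustration of text classification on labeled data, the sketch below trains a Naïve Bayes model with scikit-learn. It assumes scikit-learn is installed, and the tiny training set, labels, and test sentences are invented for this example. A real project would need far more data and a proper train/test split.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled reviews (made up for illustration)
train_texts = [
    "great movie loved it",
    "wonderful acting great plot",
    "terrible movie hated it",
    "awful acting boring plot",
]
train_labels = ["pos", "pos", "neg", "neg"]

# Chain the two steps: word counts (Document-Term Matrix), then Naive Bayes
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Classify unseen text; words outside the training vocabulary are ignored
new_texts = ["loved the great acting", "boring and awful"]
predictions = model.predict(new_texts)

for text, label in zip(new_texts, predictions):
    print(f"{label}: {text}")
```

Wrapping the vectorizer and classifier in one pipeline means new text is vectorized with the same vocabulary that was learned during training.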
Once you have a solid understanding of the foundational NLP concepts, we’ll move on to the second half of the course on modern NLP techniques, which covers the major advancements in NLP and the data science mindset shift over the past decade.
We’ll start with the basic building blocks of modern NLP techniques, which are neural networks. You’ll learn how neural networks are trained, become familiar with key terms like layers, nodes, weights, and activation functions, and then get introduced to popular deep learning architectures and their practical applications.
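To tie those key terms together, here is a small sketch of a forward pass through a two-layer network in plain Python. The weights, biases, and inputs are hand-picked for illustration. In practice they are learned during training, which this sketch does not show.

```python
import math

def relu(x):
    """Activation function: pass positive values through, clip negatives to 0."""
    return max(0.0, x)

def sigmoid(x):
    """Activation function: squash any value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """One layer: each node computes activation(weighted sum of inputs + bias)."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

# Hypothetical input features and hand-picked weights (normally learned)
inputs = [1.0, 2.0]

# Hidden layer: 2 nodes, each with one weight per input
hidden_weights = [[0.5, -1.0], [1.0, 1.0]]
hidden_biases = [0.0, -1.0]
hidden = dense(inputs, hidden_weights, hidden_biases, relu)

# Output layer: 1 node that reads the 2 hidden values
output_weights = [[1.0, 0.5]]
output_biases = [-1.0]
output = dense(hidden, output_weights, output_biases, sigmoid)

print(hidden)  # [0.0, 2.0]
print(output)  # [0.5]
```

The first hidden node's weighted sum is negative, so ReLU zeroes it out, and only the second node contributes to the final output.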
After that, we’ll talk about Transformers, the architecture behind popular LLMs like ChatGPT, Gemini, and Claude. We’ll cover how the main layers work and what they do, including embeddings, attention, and feedforward neural networks. We’ll also review the differences between encoder-only, decoder-only, and encoder-decoder models, and the types of LLMs that fall into each category.
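For a feel of what the attention layer computes, here is a simplified sketch of scaled dot-product self-attention in plain Python. The three 2-dimensional "embeddings" are made up, and the sketch skips the learned query, key, and value projections and the multiple heads that real Transformers use, so treat it as an illustration of the core idea only.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Output is a weighted average of the value vectors
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical token embeddings: 3 tokens, 2 dimensions each
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: the same vectors act as queries, keys, and values
outputs, weights = attention(embeddings, embeddings, embeddings)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights shows how much one token "attends" to every token in the sequence. Here the first token puts more weight on itself and the third token than on the second, because their vectors point in more similar directions.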
Last but not least, we’re going to apply what we’ve learned with Python. We’ll be using Hugging Face’s Transformers library and its Model Hub to demo six practical NLP applications: Sentiment Analysis, Named Entity Recognition, Zero-Shot Classification, Text Summarization, Text Generation, and Document Similarity.
COURSE OUTLINE:
- Installation & Setup
  - Install Anaconda, start writing Python code in a Jupyter Notebook, and learn how to create a new conda environment to get set up for this course
- Natural Language Processing 101
  - Review the basics of natural language processing (NLP), including key concepts, the evolution of NLP over the years, and its applications & Python libraries
- Text Preprocessing
  - Walk through the text preprocessing steps required before applying machine learning algorithms, including cleaning, normalization, vectorization, and more
- NLP with Machine Learning
  - Perform sentiment analysis, text classification, and topic modeling using traditional NLP methods, including rules-based, supervised, and unsupervised machine learning techniques
- Neural Networks & Deep Learning
  - Visually break down the concepts behind neural networks and deep learning, the building blocks of modern NLP techniques
- Transformers & LLMs
  - Dive into the main parts of the transformer architecture, including embeddings, attention, and FFNs, as well as popular LLMs for NLP tasks like BERT, GPT, and more
- Hugging Face Transformers
  - Introduce the Hugging Face Transformers library in Python and walk through examples of how you can use pretrained LLMs to perform NLP tasks, including sentiment analysis, named entity recognition (NER), zero-shot classification, text summarization, text generation, and document similarity
- NLP Review & Next Steps
  - Review the NLP techniques covered in this course, when to use them, and how to dive deeper and stay up-to-date
__________
Ready to dive in? Join today and get immediate, LIFETIME access to the following:
- 12.5 hours of high-quality video
- 13 homework assignments
- 4 interactive exercises
- Natural Language Processing in Python ebook (200+ pages)
- Downloadable project files & solutions
- Expert support and Q&A forum
- 30-day Udemy satisfaction guarantee
If you’re an aspiring or seasoned data scientist looking for a practical overview of both traditional and modern NLP techniques in Python, this is the course for you.
Happy learning!
-Alice Zhao (Python Expert & Data Science Instructor, Maven Analytics)
Who this course is for:
- Aspiring Data Scientists who want a practical overview of natural language processing techniques in Python
- Seasoned Data Scientists looking to learn the latest NLP techniques and tools, such as Transformers, LLMs, and Hugging Face