Text Preprocessing in NLTK

Test how well you can clean, normalize, and prepare text data for NLP tasks.

1. What is the process of splitting a text into individual words or tokens called in NLTK?
2. Which of the following are stemmers available in NLTK? (Select all that apply)
3. True or false: NLTK's WordNetLemmatizer requires part-of-speech (POS) tags to accurately lemmatize words that are not nouns.
4. What is the name of the NLTK function used to download resources like stopwords? (exact function name)
5. Which NLTK corpus contains a list of common stopwords for various languages?
6. Which of the following are common steps in text preprocessing using NLTK? (Select all that apply)
7. True or false: Stemming in NLTK always produces valid English words as output.
8. What is the output of nltk.word_tokenize("Hello, world!")? (provide tokens as comma-separated strings without spaces)
9. What key difference distinguishes lemmatization from stemming in NLTK?
10. Which NLTK tools are used for sentence tokenization? (Select all that apply)
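
To experiment with these ideas before answering, the following is a minimal sketch of a preprocessing pass. It assumes NLTK is installed and uses only `RegexpTokenizer` and `PorterStemmer`, chosen here because neither needs downloaded corpus data. The helper name `preprocess` is hypothetical and is not part of NLTK.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer


def preprocess(text):
    """Lowercase the text, split it into word tokens, and stem each token.

    `preprocess` is an illustrative helper, not an NLTK function.
    """
    # Keep runs of word characters; punctuation is dropped.
    tokenizer = RegexpTokenizer(r"\w+")
    stemmer = PorterStemmer()
    tokens = tokenizer.tokenize(text.lower())
    return [stemmer.stem(token) for token in tokens]


if __name__ == "__main__":
    print(preprocess("The studies were running!"))
```

Other tokenizers, stopword lists, and lemmatizers in NLTK rely on resources that must be downloaded first, so they are left out of this sketch.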