Natural Language Processing Engineer Question:
You have created a document-term matrix of the data, treating every tweet as one document. Which of the following statements about the document-term matrix is correct?
1. Removal of stopwords from the data will affect the dimensionality of the data
2. Normalization of words in the data will reduce the dimensionality of the data
3. Converting all the words to lowercase will not affect the dimensionality of the data
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1, 2 and 3
Answer:
D) 1 and 2
Statements 1 and 2 are correct: stopword removal decreases the number of features (columns) in the matrix, and normalization of words merges redundant features. Statement 3 is incorrect, because converting all words to lowercase also decreases the dimensionality by merging case variants such as "Great" and "great" into a single term.
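The effect of each preprocessing step on the number of columns can be checked with a minimal sketch. The tweets, the stopword list, and the normalization map below are invented for illustration (a real pipeline would use a proper tokenizer and a stemmer or lemmatizer), and `vocabulary` is a hypothetical helper, not a library function.

```python
# Toy data: three made-up tweets, each treated as one document.
tweets = [
    "The match was great and the players were playing well",
    "Great match the player played well",
    "Playing in the rain is not great",
]

# Tiny hypothetical stopword list (real lists are much longer).
STOPWORDS = {"the", "was", "and", "were", "in", "is", "not"}

# Hypothetical lookup standing in for a real stemmer or lemmatizer.
NORMALIZE = {"players": "player", "playing": "play", "played": "play"}


def vocabulary(docs, lowercase=False, remove_stopwords=False, normalize=False):
    """Return the set of distinct terms, i.e. the columns of the document-term matrix."""
    vocab = set()
    for doc in docs:
        for token in doc.split():  # naive whitespace tokenization
            if lowercase:
                token = token.lower()
            if remove_stopwords and token.lower() in STOPWORDS:
                continue  # stopwords contribute no column
            if normalize:
                token = NORMALIZE.get(token, token)  # exact-match lookup only
            vocab.add(token)
    return vocab


baseline = len(vocabulary(tweets))
print("no preprocessing :", baseline)
print("stopwords removed:", len(vocabulary(tweets, remove_stopwords=True)))
print("normalized       :", len(vocabulary(tweets, normalize=True)))
print("lowercased       :", len(vocabulary(tweets, lowercase=True)))
print("all three        :", len(vocabulary(
    tweets, lowercase=True, remove_stopwords=True, normalize=True)))
```

On this toy data every step shrinks the vocabulary relative to the baseline, including lowercasing, which is why statement 3 does not hold.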