Natural Language Processing Engineer Question:


You have created a document-term matrix of the data, treating every tweet as one document. Which of the following is correct with regard to the document-term matrix?

1. Removal of stopwords from the data will affect the dimensionality of the data
2. Normalization of words in the data will reduce the dimensionality of the data
3. Converting all the words to lowercase will not affect the dimensionality of the data
A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1, 2 and 3

Answer:

D) 1 and 2

Statements 1 and 2 are correct: stopword removal decreases the number of features in the matrix, and normalization of words merges redundant features. Statement 3 is incorrect because converting all words to lowercase also decreases the dimensionality, since case variants such as "The" and "the" collapse into a single feature.
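
The effect of each preprocessing step on the number of columns can be checked on a toy example. The following is a minimal sketch in plain Python, not a production pipeline: the three tweets, the tiny `STOPWORDS` set, and the `LEMMAS` lookup table (standing in for a real stemmer or lemmatizer) are all made up for illustration, and tokenization is a simple whitespace split.

```python
# Toy tweets, invented for illustration only.
tweets = [
    "The cat is playing",
    "the Cats played in the garden",
    "A dog plays with the cat",
]

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "is", "in", "a", "with"}

# Hypothetical lookup table standing in for a real stemmer/lemmatizer.
LEMMAS = {"cats": "cat", "playing": "play", "played": "play", "plays": "play"}


def preprocess(doc, lowercase=False, remove_stopwords=False, normalize=False):
    """Split a document on whitespace and apply the selected steps."""
    tokens = []
    for token in doc.split():
        if lowercase:
            token = token.lower()
        if remove_stopwords and token.lower() in STOPWORDS:
            continue  # drop the stopword entirely
        if normalize:
            token = LEMMAS.get(token.lower(), token)
        tokens.append(token)
    return tokens


def doc_term_matrix(docs, **options):
    """Return (vocabulary, matrix) with one row per document and one column per term."""
    tokenized = [preprocess(doc, **options) for doc in docs]
    vocab = sorted({token for tokens in tokenized for token in tokens})
    matrix = [[tokens.count(term) for term in vocab] for tokens in tokenized]
    return vocab, matrix


settings = [
    ("raw", {}),
    ("lowercase", {"lowercase": True}),
    ("stopwords removed", {"remove_stopwords": True}),
    ("normalized", {"normalize": True}),
    ("all three", {"lowercase": True, "remove_stopwords": True, "normalize": True}),
]

for label, options in settings:
    vocab, matrix = doc_term_matrix(tweets, **options)
    print(f"{label}: {len(matrix)} documents x {len(vocab)} terms")
```

On this toy data the number of columns drops from 13 (raw) to 12 with lowercasing, 7 with stopword removal, 10 with normalization, and 4 with all three steps combined, which is consistent with the answer above.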


N-grams are defined as combinations of N consecutive keywords. How many bi-grams can be generated from the given sentence:

“Analytics Vidhya is a great source to learn data science”

A) 7
B) 8
C) 9
D) 10
E) 11
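
Counting the bi-grams can be sketched in a few lines of Python. This assumes whitespace tokenization and that every pair of adjacent words counts as one bi-gram; the helper name `ngrams` is illustrative and not taken from any particular library.

```python
sentence = "Analytics Vidhya is a great source to learn data science"


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = sentence.split()    # whitespace tokenization (an assumption)
bigrams = ngrams(tokens, 2)

print(len(tokens), "tokens,", len(bigrams), "bi-grams")
for pair in bigrams:
    print(" ".join(pair))
```

Under these assumptions a sentence of k tokens yields k - 1 bi-grams, so the 10 words here give 9.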

True or False: The Word2Vec model is a machine learning model used to create vector notations of text objects. Word2Vec contains multiple deep neural networks.

A) TRUE
B) FALSE