Database Analyst Question:
What are the best practices for data cleaning?
Answer:
☛ Separate the data according to its attributes.
☛ For massive datasets, cleanse in steps, improving the data at each step until its quality is good.
☛ For common cleansing tasks, build a set of scripts, for example one that blanks out every value not matching a regex.
☛ Analyze the statistics for every column.
☛ Keep track of all cleaning operations, so that changes can be made when necessary.
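As a rough illustration of the scripting, statistics, and tracking points above, here is a minimal Python sketch using only the standard library. The function names (blank_non_matching, column_stats), the sample rows, and the five-digit pattern are all assumptions invented for this example; a real project would adapt them to its own data and tooling.

```python
import re
from collections import Counter


def blank_non_matching(rows, column, pattern, log):
    """Set values in `column` that do not fully match `pattern` to None."""
    regex = re.compile(pattern)
    blanked = 0
    for row in rows:
        value = row.get(column)
        if value is not None and not regex.fullmatch(str(value)):
            row[column] = None
            blanked += 1
    # Record the operation so it can be reviewed or revised later.
    log.append({"op": "blank_non_matching", "column": column,
                "pattern": pattern, "blanked": blanked})
    return rows


def column_stats(rows, column):
    """Basic per-column statistics: total, missing, and distinct values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "most_common": Counter(present).most_common(1),
    }


# Hypothetical sample data, for illustration only.
rows = [
    {"id": 1, "zip": "12345"},
    {"id": 2, "zip": "1234a"},
    {"id": 3, "zip": "98765"},
    {"id": 4, "zip": ""},
]

cleaning_log = []
blank_non_matching(rows, "zip", r"\d{5}", cleaning_log)
stats = column_stats(rows, "zip")
```

Running the statistics again after each cleaning step, and keeping the log alongside the data, is one way to approach the stepwise cleansing described above: each pass shows whether quality improved and which operation caused the change.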