Introduction
Data screening is the process of inspecting data for errors and correcting them before analysis, which increases the overall credibility of the data (Fitrianto & Midi, 2011). This paper defines data screening, outlines its goals, and identifies approaches to dealing with outliers and missing data.
Data Screening Goals
Increasing Data Reliability and Credibility
Data are collected to understand the dynamics of a specific phenomenon: different variables are measured so that decisions can be informed by data (Van den Broeck, Cunningham, Eeckels, & Herbst, 2005). However, outliers and missing data can undermine the credibility of the resulting data set and its reliability in the decision-making process (Fitrianto & Midi, 2011). Data reliability is established by checking that the data are sufficient and by correcting any outliers and missing values.
Supporting the Transformation and Standardization of Data
Data screening plays an instrumental role in assessing whether the data meet the statistical assumptions of the planned analysis, namely linearity, normality, and homogeneity of variance. Where these assumptions are violated, transforming or standardizing the data can bring the data closer to meeting them. Standardization in turn enhances the reliability of the results and their usefulness in making decisions (Van den Broeck et al., 2005).
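As a minimal sketch of what standardization and a rough normality check might look like, the Python below converts a variable to z-scores and computes its skewness. The function names and the sample values are hypothetical and are not taken from the cited sources; skewness is only one informal indicator and does not replace a formal test of normality.

```python
import statistics


def standardize(values):
    """Return z-scores: (x - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]


def skewness(values):
    """Population skewness; a value near 0 suggests a roughly symmetric distribution."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)


# Hypothetical measurements used only for illustration.
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z_scores = standardize(scores)
print([round(z, 2) for z in z_scores])
print(round(skewness(scores), 2))
```

After standardization the variable has a mean of 0 and a standard deviation of 1, which makes variables measured on different scales comparable.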
Remedies
Data Entry Errors
Errors made during data entry can negatively affect overall data reliability. Data entry is one of the stages of data processing at which the whole collection and analysis process can be compromised (Van den Broeck et al., 2005). As such, setting data entry goals and standards is a sustainable remedy for preventing these errors. An automated error reporting system and software tools such as Intelligent Character Recognition technology can complement the goals and standards and help to prevent errors.
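One way to picture data entry standards combined with automated error reporting is a set of validation rules applied to each record as it is entered. The sketch below is illustrative only: the field names (`age`, `sex`), the permitted ranges, and the function `validate_record` are assumptions made for this example, not rules taken from the cited sources.

```python
def validate_record(record, rules):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field, check in rules.items():
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"{field}: missing")
        elif not check(value):
            problems.append(f"{field}: invalid value {value!r}")
    return problems


# Hypothetical entry standards: each field maps to a rule it must satisfy.
RULES = {
    "age": lambda v: isinstance(v, int) and 0 <= v <= 120,
    "sex": lambda v: v in {"F", "M"},
}

# Hypothetical records; the second and third contain entry errors.
records = [
    {"id": 1, "age": 34, "sex": "F"},
    {"id": 2, "age": 340, "sex": "M"},
    {"id": 3, "age": 51, "sex": ""},
]

# The error report maps each record id to the problems found in it.
error_report = {r["id"]: validate_record(r, RULES) for r in records}
for record_id, problems in error_report.items():
    print(record_id, problems)
```

Keeping the rules in one dictionary means the entry standards are written down in a single place and can be reviewed or extended without changing the checking logic.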
Outliers
Outliers are sample units with extreme values on individual variables. They can have a significant impact on the outcome of the analysis and lead to erroneous conclusions. Outliers fall away from the rest of the data, typically three or more standard deviations from the mean (Van den Broeck et al., 2005). One approach to dealing with outliers is to remove the affected observations from the records. A second is to cap the outlying values, which keeps the variable within a specified limit. However, where the outlier is due to an input error, one can assign a new value, using regression to predict a replacement (Fitrianto & Midi, 2011).
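The three-standard-deviation rule, removal, and capping described above can be sketched in a few lines of Python. The function names, the default threshold parameter, and the sample readings are hypothetical; the sketch also assumes the mean and standard deviation are computed from the full sample, outlier included.

```python
import statistics


def find_outliers(values, threshold=3.0):
    """Return indices of values lying `threshold` or more SDs from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [i for i, x in enumerate(values) if abs(x - mean) >= threshold * sd]


def remove_outliers(values, threshold=3.0):
    """Drop the flagged observations from the data."""
    flagged = set(find_outliers(values, threshold))
    return [x for i, x in enumerate(values) if i not in flagged]


def cap_outliers(values, threshold=3.0):
    """Pull extreme values back to the limits mean +/- threshold * SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    lower, upper = mean - threshold * sd, mean + threshold * sd
    return [min(max(x, lower), upper) for x in values]


# Hypothetical readings: nineteen values near 10 and one extreme value.
readings = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10,
            10, 11, 9, 10, 12, 8, 10, 11, 9, 100]

print(find_outliers(readings))
print(len(remove_outliers(readings)))
print(round(max(cap_outliers(readings)), 2))
```

Because an extreme value inflates both the mean and the standard deviation, this rule can miss outliers in small samples, so the flagged values are best treated as candidates for review rather than as confirmed errors.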
Missing Data
In the case of missing data, multiple regression analysis can be used to estimate the missing value, and it is the approach recommended here. Regression substitution predicts the missing value from the other values in the data set (Fitrianto & Midi, 2011).
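Regression substitution can be illustrated with a deliberately simplified sketch. The example below uses a single predictor fitted by ordinary least squares rather than the multiple regression described above, and the variable names (`hours`, `scores`) and values are hypothetical. Missing entries are represented as `None`.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute_by_regression(xs, ys):
    """Replace each missing y (None) with the value predicted from its x."""
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    slope, intercept = fit_line([x for x, _ in complete],
                                [y for _, y in complete])
    return [y if y is not None else slope * x + intercept
            for x, y in zip(xs, ys)]


# Hypothetical data: the third score is missing.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, None, 58.0, 60.0]

filled = impute_by_regression(hours, scores)
print(filled)
```

The model is fitted only on the complete cases, and observed values are left unchanged. With several predictors the same idea applies, but the fit would normally be done with a statistics library rather than by hand.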
References
Fitrianto, A., & Midi, H. (2011). Procedures for generating a true clean data in simple mediation analysis. World Applied Sciences Journal, 15(7), 1046-1053. Retrieved from https://pdfs.semanticscholar.org/5017/5c21618fea51d7bfc2fd3b56a34269cfff74.pdf
Van den Broeck, J., Cunningham, S. A., Eeckels, R., & Herbst, K. (2005). Data cleaning: Detecting, diagnosing, and editing data abnormalities. PLoS Medicine, 2(10), e267. Retrieved from https://journals.plos.org/plosmedicine/article/file?id=10.1371/journal.pmed.0020267&type=printable
Cite this page
Data Screening: Goals, Outliers & Missing Data - Essay Sample. (2023, Feb 23). Retrieved from https://proessays.net/essays/data-screening-goals-outliers-missing-data-essay-sample