Research Paper on Collecting Data in Statistics: Primary vs Secondary Methods

Paper Type: Essay

Pages: 4

Wordcount: 1049 Words

Date: 2023-03-12

Introduction

In statistics, there are numerous methods for collecting data. However, the appropriateness of each data-gathering technique depends on the type of data and the circumstances under study. As such, the approaches for collecting primary and secondary data are not the same. Methods for obtaining primary data include experiments, surveys, interviews, and observations (Tan, 2017). Sources of secondary data, on the other hand, include books, journals, databases, the internet, records, newspapers, and other publications. In this case, the interview, the questionnaire, and the internet are the three methods appropriate for data collection. The Paralyzed Veterans of America stores data about donations in its database, and therefore the researcher can obtain the data from its website.

Similarly, the researcher can collect the required data through questionnaires. A questionnaire gives respondents an opportunity to provide feedback in response to the questions asked. The interview is also crucial for obtaining data about the contributors to the Paralyzed Veterans of America. In the interview method, the interviewer interacts one on one with the interviewee to get information. The organization's management is the relevant authority from which to obtain the data.

Statistical Data Analysis Technique and Assumptions

The discussion seeks to assess the relationship between the predictor variables and the response variable. The donation amount is the dependent (response) variable, while the measures of past contributions are the independent (predictor) variables. The study uses quantitative data analysis techniques to assess the association between the donation amount and the independent variables. Mainly, the discussion employs correlation analysis and graphical presentation to examine the relationship between variables. Correlation analysis is the statistical approach used by researchers to examine the strength of association between variables (Cohen, West, & Aiken, 2014). The relationship between the variables under study can be either positive or negative. In addition, the association can be strong, moderate, or weak, depending on the correlation coefficient. A positive correlation coefficient shows that the variables change in the same direction, while a negative one shows that they move in opposite directions (Hayes, 2017). When the absolute value of Pearson's correlation coefficient approaches 1, the connection between the variables is strong; when it nears 0, the connection is weak (Srivastava, 2008). The statistical assumptions are that the data are continuous, that the relationship between the variables is linear, that there are no or only insignificant outliers, and that the data are normally distributed (Welsh, 2011).
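
For reference, the sample Pearson correlation coefficient discussed above is conventionally defined as shown below. The symbols are notation chosen here for illustration rather than taken from the cited sources: x_i is a predictor value, y_i is the corresponding donation amount, and n is the number of observations.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

The sign of r gives the direction of the relationship, and its absolute value gives the strength.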

Justification for the Selected Data Analysis Methods

The statistical techniques used are correlation analysis and a graphical approach. Correlation helps in finding the connection between the independent and dependent variables. In this case, the discussion seeks to determine the association between the gift amount from donors and the 26 predictor variables. Correlation analysis is suitable for establishing a functional as well as a linear relationship between the variables evaluated (Ranjan, 2013). Both the graphical and the correlation approaches help indicate whether the various independent variables play a part in determining the total donation received from donors. A correlation study generates the correlation coefficient, which denotes the level of connection between the variables under scrutiny (Bruce & Bruce, 2017). The Pearson correlation coefficient indicates whether the relationship is positive or negative. A positive coefficient implies that a specific independent variable moves in the same direction as the dependent variable. For instance, the correlation coefficient between the last gift donated and the gift amount is 0.7. This implies that the association between the two variables is positive: the gift amount increases with the previous gift donated. Examining the correlation coefficients makes it possible to identify the factors that improve the overall donation and those that do not. The result is also significant because it helps identify areas of improvement for future performance.
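
The interpretation above can be illustrated with a minimal Python sketch that computes Pearson's coefficient by hand and labels its direction and strength. The data values, the names `pearson_r` and `describe`, and the 0.7 and 0.3 strength cutoffs are all assumptions made for illustration; they are not the essay's data set or thresholds.

```python
import math

def pearson_r(xs, ys):
    """Return the sample Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

def describe(r, strong=0.7, weak=0.3):
    """Label direction and strength; the cutoffs are illustrative assumptions."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    size = abs(r)
    strength = "strong" if size >= strong else "weak" if size < weak else "moderate"
    return direction, strength

# Hypothetical values, invented for illustration only.
last_gift = [5.0, 10.0, 15.0, 20.0, 25.0]
gift_amount = [6.0, 9.0, 16.0, 18.0, 27.0]

r = pearson_r(last_gift, gift_amount)
print(round(r, 3), describe(r))
```

Under these assumed cutoffs, the essay's reported coefficient of 0.7 would be labeled positive and strong.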

Correlation Analysis Process for Solving the Problem

Microsoft Excel is appropriate for analyzing the sampled data to generate the correlation coefficients. The Pearson coefficient helps in interpreting the connection between the variables: when it is high, the relationship is strong, and when it is low, the relationship is weak. The coefficient helps identify which variables have more or less impact on the predicted variable (Kumar & Chong, 2018). The analysis shows that most of the variables have only a minimal influence on the donation amount. The variables with a relatively greater impact are the smallest and largest gifts to date, the average contribution amount, and the last gift.

Application of Data Mining to Solve the Problem

Data mining uses metadata to predict patterns and relationships between variables. From the associations observed, one can build a predictive model of a specific data set to support decision making (MacLennan, Tang, & Crivat, 2011). In this case, the organization can use the observed links between variables to develop specific approaches for increasing efficiency when requesting donations from donors.

Structured or Unstructured Problem

A structured problem can be solved optimally in a single step, while an unstructured problem has no predefined solution method because of uncertainty or a lack of information about the issue (Oz, 2004). In this case, the problem is unstructured because there is no known method for solving it.

Assessing How the Variables can Answer the Problem

The variables can help solve the problem if the organization gives more attention to those with a positive correlation coefficient and ignores those with a negative or weak connection to the donation amount.
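
The selection rule above can be sketched as a simple filter over already-computed coefficients. This is a minimal illustration: the function name `select_predictors`, the 0.3 cutoff for a "weak" connection, and all variable names and values other than the 0.7 reported for the last gift are assumptions.

```python
def select_predictors(coefficients, min_r=0.3):
    """Keep variables whose correlation with the donation amount is positive and at least min_r."""
    return {name: r for name, r in coefficients.items() if r >= min_r}

# Hypothetical coefficients; only the 0.7 for the last gift is quoted in the essay.
coefficients = {
    "last_gift": 0.7,
    "months_since_last_gift": -0.2,
    "number_of_promotions": 0.05,
}

selected = select_predictors(coefficients)
print(selected)
```

With these assumed inputs, only the last gift passes the filter; the negative and near-zero coefficients are dropped.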

References

Bruce, P., & Bruce, A. (2017). Practical statistics for data scientists: 50 essential concepts. O'Reilly Media, Inc.

Cohen, P., West, S. G., & Aiken, L. S. (2014). Applied multiple regression/correlation analysis for the behavioral sciences. Psychology Press.

Hayes, A. F. (2017). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. Guilford Publications.

Kumar, S., & Chong, I. (2018). Correlation analysis to identify the effective data in machine learning: Prediction of depressive disorder and emotion states. International Journal of Environmental Research and Public Health, 15(12), 2907.

MacLennan, J., Tang, Z., & Crivat, B. (2011). Data mining with Microsoft SQL server 2008. John Wiley & Sons.

Oz, E. (2004). Management information systems. Course Technology.

Ranjan, P. (2013). How far is correlation justified as a means of analysis for data types that have a well distributed scatter plot? Indian Institute of Technology Madras. Retrieved from https://www.researchgate.net/

Srivastava, T. N. (2008). Statistics for management. Tata McGraw-Hill Education.

Tan, W. (2017). Research Methods: A Practical Guide for Students and Researchers. World Scientific Publishing Company.

Welsh, A. H. (2011). Aspects of statistical inference (Vol. 916). John Wiley & Sons.

Cite this page

Research Paper on Collecting Data in Statistics: Primary vs Secondary Methods. (2023, Mar 12). Retrieved from https://proessays.net/essays/research-paper-on-collecting-data-in-statistics-primary-vs-secondary-methods
