Descriptive statistics summarize and organize characteristics of a data set. A data set is a collection of responses or observations from a sample or entire population.
In quantitative research, after collecting data, the first step of statistical analysis is to describe characteristics of the responses, such as the average of one variable (e.g., age), or the relation between two variables (e.g., age and creativity).
The next step is inferential statistics, which help you decide whether your data confirms or refutes your hypothesis and whether it is generalizable to a larger population.
Types of descriptive statistics
There are 3 main types of descriptive statistics:
The distribution concerns the frequency of each value.
The central tendency concerns the averages of the values.
The variability or dispersion concerns how spread out the values are.
You can apply these to assess only one variable at a time, in univariate analysis, or to compare two or more, in bivariate and multivariate analysis.
Research example
You want to study the popularity of different leisure activities by gender. You distribute a survey and ask participants how many times they did each of the following in the past year:
Go to a library
Watch a movie at a theater
Visit a national park
Your data set is the collection of responses to the survey. Now you can use descriptive statistics to find out the overall frequency of each activity (distribution), the averages for each activity (central tendency), and the spread of responses for each activity (variability).
Frequency distribution
A data set is made up of a distribution of values, or scores. In tables or graphs, you can summarize the frequency of every possible value of a variable in numbers or percentages. This is called a frequency distribution.
Simple frequency distribution table
For the variable of gender, you list all possible answers in the left-hand column. You count the number or percentage of responses for each answer and display it in the right-hand column.
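As a rough illustration of the simple frequency distribution described above, the sketch below counts answers and converts them to percentages. The responses and the helper name `frequency_table` are invented for this example; they are not the survey's actual data.

```python
from collections import Counter

# Hypothetical gender responses, made up for illustration only.
responses = [
    "Female", "Male", "Female", "Other", "Female",
    "Male", "Female", "Male", "Male", "Female",
]


def frequency_table(values):
    """Return {answer: (count, percentage)} ordered from most to least frequent."""
    counts = Counter(values)
    total = sum(counts.values())
    return {
        answer: (count, round(100 * count / total, 1))
        for answer, count in counts.most_common()
    }


table = frequency_table(responses)

# Left-hand column: the answer. Right-hand columns: count and percentage.
for answer, (count, percent) in table.items():
    print(f"{answer:<8}{count:>4}{percent:>7.1f}%")
```

Reporting the count and the percentage side by side keeps the raw sample size visible while still making groups easy to compare.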
Measures of central tendency
Measures of central tendency estimate the center, or average, of a data set. The mean, median and mode are 3 ways of finding the average.
Here we will demonstrate how to calculate the mean, median, and mode using the first 6 responses of our survey.
Mean
The mean, or M, is the most commonly used method for finding the average.
To find the mean, simply add up all response values and divide the sum by the total number of responses. The total number of responses or observations is called N.
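A minimal sketch of the three measures in Python follows. It assumes the six responses are the library-visit values listed later in this article (15, 3, 12, 0, 24, 3), and it uses the standard-library `statistics` module for the median and mode.

```python
import statistics

# Assumed to be the first six responses: visits to the library in the past year.
library_visits = [15, 3, 12, 0, 24, 3]

# Mean: add up all response values and divide by the number of responses (N).
n = len(library_visits)
mean = sum(library_visits) / n

# Median: the middle value of the ordered data. With an even N,
# statistics.median averages the two middle values.
median = statistics.median(library_visits)

# Mode: the most frequently occurring value.
mode = statistics.mode(library_visits)

print(f"N = {n}, mean = {mean}, median = {median}, mode = {mode}")
```

For this data the three measures differ (9.5, 7.5, and 3), which is one reason to report more than one of them.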
Measures of variability
Measures of variability give you a sense of how spread out the response values are. The range, standard deviation and variance each reflect different aspects of spread.
Range
The range gives you an idea of how far apart the most extreme response scores are. To find the range, simply subtract the lowest value from the highest value.
Range of visits to the library in the past year
Ordered data set: 0, 3, 3, 12, 15, 24
Range: 24 – 0 = 24
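The same calculation can be sketched in a few lines of Python; the variable names are chosen for this example only.

```python
# Visits to the library in the past year, in the order they were collected.
library_visits = [15, 3, 12, 0, 24, 3]

# Ordering the data makes the two extremes easy to read off.
ordered = sorted(library_visits)

# Range: highest value minus lowest value.
value_range = ordered[-1] - ordered[0]

print(ordered)      # [0, 3, 3, 12, 15, 24]
print(value_range)  # 24
```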
Standard deviation
The standard deviation (s or SD) is the average amount of variability in your dataset. It tells you, on average, how far each score lies from the mean. The larger the standard deviation, the more variable the data set is.
There are six steps for finding the standard deviation:
List each score and find their mean.
Subtract the mean from each score to get the deviation from the mean.
Square each of these deviations.
Add up all of the squared deviations.
Divide the sum of the squared deviations by N – 1.
Find the square root of the number you found.
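Standard deviation of visits to the library in the past year
You complete Steps 1 through 4 for the data set 15, 3, 12, 0, 24, 3: the mean is 9.5, and the sum of the squared deviations is 421.5.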
Step 5: 421.5/5 = 84.3
Step 6: √84.3 = 9.18
From learning that s = 9.18, you can say that on average, each score deviates from the mean by 9.18 points.
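The six steps can be traced in Python as a sketch, using the library-visit data from this article. The final line compares the result with `statistics.stdev`, which also divides by N – 1.

```python
import math
import statistics

scores = [15, 3, 12, 0, 24, 3]
n = len(scores)

# Step 1: list each score and find the mean.
mean = sum(scores) / n

# Step 2: subtract the mean from each score to get its deviation.
deviations = [score - mean for score in scores]

# Step 3: square each deviation.
squared_deviations = [d ** 2 for d in deviations]

# Step 4: add up the squared deviations.
sum_of_squares = sum(squared_deviations)

# Step 5: divide by N - 1.
variance = sum_of_squares / (n - 1)

# Step 6: take the square root.
standard_deviation = math.sqrt(variance)

print(sum_of_squares)                # 421.5
print(round(variance, 1))            # 84.3
print(round(standard_deviation, 2))  # 9.18

# Cross-check against the standard library's sample standard deviation.
assert math.isclose(standard_deviation, statistics.stdev(scores))
```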
Variance
The variance is the average of squared deviations from the mean. Variance reflects the degree of spread in the data set. The more spread out the data is, the larger the variance is in relation to the mean.
To find the variance, simply square the standard deviation. The symbol for variance is s².
Variance of visits to the library in the past year
Data set: 15, 3, 12, 0, 24, 3
s = 9.18
s² = 84.3
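As a quick check, the sketch below computes the sample variance with the standard-library `statistics` module and confirms that it matches the squared standard deviation. It also shows that squaring the rounded value 9.18 gives a slightly different number, so it is safer to square the unrounded standard deviation.

```python
import math
import statistics

data = [15, 3, 12, 0, 24, 3]

# Sample standard deviation and sample variance (both divide by N - 1).
s = statistics.stdev(data)
s_squared = statistics.variance(data)

print(round(s, 2))          # 9.18
print(round(s_squared, 1))  # 84.3

# Squaring the unrounded standard deviation recovers the variance.
assert math.isclose(s ** 2, s_squared)

# Squaring the rounded value introduces a small rounding error.
from_rounded = 9.18 ** 2
print(round(from_rounded, 4))  # 84.2724
```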
Univariate descriptive statistics
Univariate descriptive statistics focus on only one variable at a time. It’s important to examine data from each variable separately using multiple measures of distribution, central tendency and spread. Programs like SPSS and Excel can be used to easily calculate these.
If you consider only the mean as a measure of central tendency, your impression of the “middle” of the data set can be skewed by outliers. The median and mode are much less affected by extreme values.
Likewise, because the range depends only on the two most extreme values and is therefore sensitive to outliers, you should also consider the standard deviation and variance to get easily comparable measures of spread.
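To see how an outlier affects these measures, the sketch below adds one extreme value to the library-visit data. The value 200 is hypothetical and chosen only to make the effect visible.

```python
import statistics

library_visits = [15, 3, 12, 0, 24, 3]

# Hypothetical outlier added for illustration; not part of the survey data.
with_outlier = library_visits + [200]


def summarize(values):
    """Return the mean, median, and range of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
    }


before = summarize(library_visits)
after = summarize(with_outlier)

for name in ("mean", "median", "range"):
    print(f"{name:<7} before: {before[name]:>7.2f}   after: {after[name]:>7.2f}")
```

In this example the mean and the range change far more than the median does once the extreme value is included.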
Bivariate descriptive statistics
If you’ve collected data on more than one variable, you can use bivariate or multivariate descriptive statistics to explore whether there are relationships between them.
In bivariate analysis, you simultaneously study the frequency and variability of two variables to see if they vary together. You can also compare the central tendency of the two variables before performing further statistical tests.
Multivariate analysis is the same as bivariate analysis but with more than two variables.
Contingency table
In a contingency table, each cell represents the intersection of two variables. Usually, an independent variable (e.g., gender) appears along the vertical axis and a dependent one appears along the horizontal axis (e.g., activities). You read “across” the table to see how the independent and dependent variables relate to each other.
Interpreting a contingency table is easier when the raw data is converted to percentages. Percentages make each row comparable to the others by making it seem as if each group had only 100 observations or participants. When creating a percentage-based contingency table, you add the N for each group of the independent variable at the end of its row.
From this table, it is clearer that similar proportions of children and adults go to the library over 17 times a year. Additionally, children most commonly go to the library between 5 and 8 times a year, while adults most commonly go between 13 and 16 times.
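The conversion from raw counts to row percentages can be sketched as follows. The counts, the category labels, and the helper name `row_percentages` are all made up for illustration; they are not the values from the article's table.

```python
# Hypothetical raw counts of library visits per year, by group.
# These numbers are invented for this example.
counts = {
    "Children": {"0-4": 40, "5-8": 70, "9-12": 40, "13-16": 30, "17+": 20},
    "Adults": {"0-4": 5, "5-8": 10, "9-12": 10, "13-16": 20, "17+": 5},
}


def row_percentages(table):
    """Convert each row of counts to percentages and keep the row total N."""
    result = {}
    for group, row in table.items():
        n = sum(row.values())
        percentages = {
            category: round(100 * count / n, 1)
            for category, count in row.items()
        }
        result[group] = (percentages, n)
    return result


percent_table = row_percentages(counts)

# Print one row per group, with N appended at the end of the row.
for group, (percentages, n) in percent_table.items():
    cells = "  ".join(f"{cat}: {pct:>5.1f}%" for cat, pct in percentages.items())
    print(f"{group:<9}{cells}  N = {n}")
```

Because the two hypothetical groups differ in size (200 versus 50), the raw counts are hard to compare directly, while the percentages put both rows on the same scale.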
Scatter plots
A scatter plot is a chart that shows you the relationship between two or three variables. It’s a visual representation of the strength of a relationship.
In a scatter plot, you plot one variable along the x-axis and another one along the y-axis. Each data point is represented by a point in the chart.
Scatter plot example: Library visits and movie theater visits
You investigate whether people who visit the library more tend to watch a movie at a theater less. You plot the number of times participants watched movies at a theater along the x-axis and visits to the library along the y-axis.
From your scatter plot, you see that as the number of movies seen at movie theaters increases, the number of visits to the library decreases. Based on your visual assessment of a possible linear relationship, you perform further tests of correlation and regression.
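As one possible follow-up to the visual check, the sketch below computes Pearson's correlation coefficient from scratch. The paired values and the function name `pearson_r` are hypothetical, chosen only to show a negative relationship like the one described.

```python
import math

# Hypothetical paired observations, one pair per participant.
movie_visits = [0, 2, 4, 6, 8, 10]      # x-axis
library_visits = [24, 15, 12, 3, 3, 0]  # y-axis


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of the products of paired deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


r = pearson_r(movie_visits, library_visits)
print(f"r = {r:.2f}")
```

A value of r close to -1 indicates a strong negative linear relationship, and a value close to 0 indicates little or no linear relationship.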
Frequently asked questions about descriptive statistics
What’s the difference between descriptive and inferential statistics? Descriptive statistics summarize the characteristics of a data set. Inferential statistics allow you to test a hypothesis or assess whether your data is generalizable to the broader population.
What are the 3 main types of descriptive statistics? The 3 main types of descriptive statistics concern the frequency distribution, central tendency, and variability of a dataset.
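What’s the difference between univariate, bivariate and multivariate descriptive statistics? Univariate statistics summarize only one variable at a time. Bivariate statistics compare two variables. Multivariate statistics compare more than two variables.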
Bhandari, P. (2023, June 21). Descriptive Statistics | Definitions, Types, Examples. Scribbr. Retrieved July 27, 2024.