Numeric variables have values that describe a measurable quantity as a number, like ‘how many’ or ‘how much’. Numeric variables are therefore quantitative variables. Categorical variables have values that describe a ‘quality’ or ‘characteristic’ of a data unit, like ‘what type’ or ‘which category’. Categorical variables are therefore qualitative variables and tend to be represented by a non-numeric value.
A continuous variable is a numeric variable whose observations can take any value within a certain range of real numbers. The value given to an observation can be as fine as the instrument of measurement allows. Examples of continuous variables include height, time, age, and temperature.
A discrete variable is a numeric variable whose observations take a value based on a count from a set of distinct whole values. A discrete variable cannot take a fractional value between one value and the next closest value. Examples of discrete variables include the number of registered cars, the number of business locations, and the number of children in a family, all of which are measured in whole units (e.g. 1, 2, 3 cars).
An ordinal variable is a categorical variable whose observations take a value that can be logically ordered or ranked. Examples of ordinal categorical variables include academic grades (e.g. A, B, C), clothing size (e.g. small, medium, large, extra large) and attitudes (e.g. strongly agree, agree, disagree, strongly disagree).
A nominal variable is a categorical variable whose observations take a value that cannot be organized in a logical sequence. Examples of nominal categorical variables include sex, business type, eye color, religion and brand (What are Variables?).
Frequency table and its associated terms
Statistical data may consist of a list of numbers related to a piece of research. Some of those numbers may occur twice or more. The number of times a value occurs in a data set is called the frequency of that value, or of the variable to which the value is assigned. The frequencies of the values in a data set are listed in a table. This table is known as a frequency distribution table, and the list is referred to as a frequency distribution (Frequency Distribution).
Here are the types of frequency distributions:
Grouped frequency distribution: an arrangement of class intervals and their corresponding frequencies in a table.
Ungrouped frequency distribution: an arrangement of the observed values in ascending order, each with its frequency. The interval width is 1 and the data are not arranged in groups.
Cumulative frequency distribution: the frequencies are shown cumulatively. The cumulative frequency at a given point is the sum of the frequency at that point and all previous frequencies.
Relative frequency distribution: when each frequency in a frequency distribution table is converted to a relative frequency, the table is called a relative frequency distribution table. For a data set consisting of n values, the relative frequency of a value is its frequency divided by n.
Relative cumulative frequency distribution: the cumulative frequency divided by the total frequency (Frequency Distribution).
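The distributions listed above can be illustrated with a short Python sketch. The data set (number of children in ten families) and all variable names are invented for illustration and do not come from the cited sources.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical sample: number of children in each of ten families.
data = [2, 0, 1, 2, 3, 2, 1, 0, 2, 1]

n = len(data)
counts = Counter(data)
values = sorted(counts)                    # observed values in ascending order
freq = [counts[v] for v in values]         # ungrouped frequency distribution
cum_freq = list(accumulate(freq))          # cumulative frequency (running total)
rel_freq = [f / n for f in freq]           # relative frequency = frequency / n
rel_cum_freq = [c / n for c in cum_freq]   # relative cumulative frequency

print("value  freq  cum  rel   rel_cum")
for v, f, c, r, rc in zip(values, freq, cum_freq, rel_freq, rel_cum_freq):
    print(f"{v:>5}  {f:>4}  {c:>3}  {r:.2f}  {rc:.2f}")
```

The relative frequencies should sum to 1, and the last relative cumulative frequency should equal 1, which makes a quick check on any table built this way.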
[Table: example of a frequency table and relative frequency table; columns: Relative Frequency, Percentage]
Chart and Graph
A histogram is a frequency distribution chart made up of a set of vertical bars whose areas are proportional to the frequencies. In a histogram, the variable is always plotted on the horizontal axis and the frequencies on the vertical axis. Such graphs are used to reveal the characteristics of discrete and continuous data, and two frequency distributions can be compared by their shapes and patterns (Frequency Distribution).
Displaying and Exploring data
A dot plot groups the data in as little space as possible without losing the identity of any individual observation. To develop a dot plot, each observation is displayed as a dot along a horizontal number line that indicates the possible values of the data.
Stem and leaf: One technique used to display quantitative information in a condensed form is the stem-and-leaf display. Each numerical value is divided into two parts: the leading digit or digits become the stem and the trailing digit becomes the leaf. The stems are located along the vertical axis and the leaf values are stacked against each other along the horizontal axis.
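A stem-and-leaf display can be sketched in a few lines of Python. The function name stem_and_leaf and the sample values are hypothetical, and the sketch assumes non-negative whole numbers with the last digit used as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} for non-negative integers.

    The last digit of each value is the leaf; the leading digit(s) are the stem.
    """
    table = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        table[stem].append(leaf)
    return dict(table)

# Hypothetical sample values, invented for illustration.
scores = [96, 93, 88, 117, 127, 95, 113, 96, 108, 94]
display = stem_and_leaf(scores)

# Stems run down the vertical axis; leaves are stacked horizontally.
for stem in sorted(display):
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in display[stem])}")
```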
Box plot: A box plot is a graphical display, based on quartiles, that helps us picture a set of data. To construct a box plot, we need five statistics:
1. The minimum value
2. The first quartile (Q1)
3. The median
4. The third quartile (Q3) and
5. The maximum value
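The five statistics can be computed with Python's standard library, as in the sketch below. The sample data are invented, and the quartile convention is an assumption: statistics.quantiles with method="inclusive" is only one of several conventions in use, so textbooks and software packages may report slightly different Q1 and Q3 values for the same data.

```python
from statistics import quantiles

# Hypothetical sample, invented for illustration.
data = [13, 1, 7, 3, 11, 5, 9]

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
q1, med, q3 = quantiles(data, n=4, method="inclusive")

five_number_summary = {
    "minimum": min(data),
    "Q1": q1,
    "median": med,
    "Q3": q3,
    "maximum": max(data),
}

for name, value in five_number_summary.items():
    print(f"{name:>8}: {value}")
```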
Skewness: Another characteristic of a set of data is its shape. Four shapes are commonly observed:
1. Symmetric
2. Positively skewed
3. Negatively skewed
4. Bimodal
The coefficient of skewness can range from -3 to +3. A value near -3 indicates considerable negative skewness, a value such as 1.63 indicates moderate positive skewness, and a value of 0, which occurs when the mean and median are equal, indicates that the distribution is symmetrical and that no skewness is present.
PEARSON’S COEFFICIENT OF SKEWNESS: $sk = \dfrac{3(\bar{x} - \text{Median})}{s}$
SOFTWARE COEFFICIENT OF SKEWNESS: $sk = \dfrac{n}{(n-1)(n-2)} \sum \left( \dfrac{x - \bar{x}}{s} \right)^{3}$
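Both coefficients can be evaluated directly from their formulas. The sketch below is a minimal illustration: the function names and the two small data sets are hypothetical, s is taken to be the sample standard deviation (statistics.stdev), and the software coefficient needs at least three observations.

```python
from statistics import mean, median, stdev

def pearson_skewness(xs):
    """Pearson's coefficient: 3 * (mean - median) / s."""
    return 3 * (mean(xs) - median(xs)) / stdev(xs)

def software_skewness(xs):
    """Software coefficient: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in xs)

# Hypothetical data sets, invented for illustration.
symmetric = [1, 2, 3, 4, 5]       # mean equals median
right_tailed = [1, 2, 2, 3, 12]   # one large value pulls the mean above the median

for name, xs in [("symmetric", symmetric), ("right_tailed", right_tailed)]:
    print(name, round(pearson_skewness(xs), 3), round(software_skewness(xs), 3))
```

For the symmetric sample both coefficients come out at zero (up to rounding), while the sample with a long right tail gives positive values on both measures.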
Describing relationship between two variables:
When we study the relationship between two variables, we refer to the data as bivariate. One graphical technique used to show the relationship between two variables is the scatter diagram. To draw a scatter diagram we need two variables: we scale one variable along the horizontal axis of a graph and the other along the vertical axis.
Contingency Tables: A contingency table is a cross-tabulation that simultaneously summarizes two variables of interest. For example:
1. Students at a university are classified by gender and class rank.
2. A product is classified as acceptable or unacceptable and by the shift (day, afternoon, or night) on which it is manufactured. (McGraw/Hill, 2015)
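The second example can be turned into a small cross-tabulation in Python. The inspection records below are invented for illustration; only the two classifying variables (shift and acceptability) come from the text.

```python
from collections import Counter

# Hypothetical inspection records: (shift, quality) for each product.
records = [
    ("day", "acceptable"), ("day", "acceptable"), ("day", "unacceptable"),
    ("afternoon", "acceptable"), ("afternoon", "unacceptable"),
    ("night", "acceptable"), ("night", "unacceptable"), ("night", "unacceptable"),
]

# Count how often each (shift, quality) combination occurs.
table = Counter(records)

shifts = ["day", "afternoon", "night"]
outcomes = ["acceptable", "unacceptable"]

print(f"{'shift':<10}" + "".join(f"{o:>14}" for o in outcomes))
for shift in shifts:
    print(f"{shift:<10}" + "".join(f"{table[(shift, o)]:>14}" for o in outcomes))
```

Each cell holds the number of products that fall into one combination of the two variables, so the cell counts add up to the total number of records.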
Probability is similar to a percentage. Probability is a branch of mathematics that deals with calculating the likelihood of a given event’s occurrence, which is expressed as a number between 0 and 1 (TechTarget). A probability of 0 denotes an impossible event, and a probability of 1 denotes a certain event. For example, when a child is born, if the probability of a girl is 0.6, then the probability of a boy is 0.4, because the two probabilities must sum to 1.