 # Meaning and Objectives of Classification of Data

## Meaning of Classification of Data

• Classification of data is the process of arranging data into homogeneous (similar) groups according to their common characteristics.
• Raw data cannot be easily understood and are not fit for further analysis and interpretation. Arranging the data helps users compare and analyse them.
• For example, the population of a town can be grouped according to sex, age, marital status, etc.
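
The population example above can be sketched in a few lines of Python. This is only an illustration: the records and the helper name `classify` are hypothetical and do not come from the text.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration only.
people = [
    {"name": "A", "sex": "F", "marital_status": "married"},
    {"name": "B", "sex": "M", "marital_status": "unmarried"},
    {"name": "C", "sex": "F", "marital_status": "unmarried"},
    {"name": "D", "sex": "M", "marital_status": "married"},
    {"name": "E", "sex": "F", "marital_status": "married"},
]


def classify(records, characteristic):
    """Group record names into classes sharing the same value of `characteristic`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[characteristic]].append(record["name"])
    return dict(groups)


by_sex = classify(people, "sex")
by_status = classify(people, "marital_status")
print(by_sex)
print(by_status)
```

The same raw records yield different homogeneous groups depending on which common characteristic is chosen as the basis of classification.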

## Classification of data

The method of arranging data into homogeneous classes according to the common features present in the data is known as classification.

A planned data classification system makes essential data easy to find and retrieve. This can be of particular interest for legal discovery, risk management, and compliance. Written procedures and guidelines for data classification should define the categories and criteria the organisation will use to organise data, and the roles of employees within the business regarding data stewardship.

Once a data-classification scheme has been designed, the security standards that stipulate proper handling practices for each category, and the storage standards that define the data’s lifecycle requirements, should be addressed.

## Objectives of Data Classification

The primary objectives of data classification are:

• To condense the mass of data in such a way that similarities and differences can be quickly understood. Figures can then be arranged in groups with common traits.
• To aid comparison.
• To point out the important characteristics of the data at a glance.
• To give prominence to the important data collected while separating out the optional elements.
• To enable statistical treatment of the material collected.

Definition of classification given by Professor Secrist: “Classification is the process of arranging data into sequences according to their common characteristics or separating them into different related parts.”

Q. What is meant by a variable? Explain its two kinds.

Answer:

(a) Meaning of variable

• The term variable is derived from the word ‘vary’, which means to differ or change. Hence, a variable is a characteristic that varies, differs, or changes from person to person, time to time, place to place, etc.
• A variable refers to a quantity or attribute whose value varies from one investigation to another.
• Example: “Price” is a variable, as the prices of different commodities are different. “Age” is a variable, as ages vary from person to person. Some more examples are height, weight, wages, expenditure, import, production, etc.

(b) Kinds of variables

(I) Discrete variables

• Variables that can take only exact (whole-number) values and not fractional values are termed discrete variables.
• For example, the number of workers or the number of students in a class is a discrete variable, as it cannot be a fraction. Similarly, the number of children in a family can be 1, 2, and so on, but cannot be 1.5 or 2.75.

(II) Continuous variables

• Variables that can take all possible values (integral as well as fractional) in a given range are termed continuous variables.
• For example, temperature, height, weight, marks, etc.

## Methods of classification

Q. Explain the bases or methods of classification.

Answer: The bases of classification are as follows:

(1) Geographical classification

• When data are classified with reference to geographical locations such as countries, states, cities, districts, etc., it is known as geographical classification.
• It is also known as ‘spatial classification’.

(2) Chronological classification

• A classification where data are grouped according to time is known as a chronological classification.
• In such a classification, data are arranged in either ascending or descending order with reference to time, such as years, quarters, months, weeks, etc.
• It is also known as ‘temporal classification’.

(3) Qualitative classification

• Under this classification, data are classified on the basis of attributes or qualities such as honesty, beauty, intelligence, literacy, marital status, etc.
• For example, the population can be divided on the basis of marital status (as married or unmarried).

(4) Quantitative classification

• This type of classification is made on the basis of measurable characteristics such as height, weight, age, income, marks of students, etc.

## Statistical series

Q. What is a statistical series? Discuss the various kinds of statistical series.

Answer:

(a) Statistical series

• A statistical series is a systematic arrangement of statistical data in some logical order.

(b) Statistical series can be divided as follows:

(I) On the basis of general characteristics

On the basis of general characteristics, statistical series are of three kinds:

(i) Time series (chronological series): If the different values that a variable has taken over a period of time are arranged in chronological order, the series so obtained is known as a time series.

(ii) Spatial series (geographical series): Data arranged according to location or geographical considerations form a spatial series.

(iii) Condition series: In this series, data are classified according to the changes occurring in variables according to a condition, such as height, weight, age, marks, income, etc.

(II) On the basis of construction

According to construction, statistical series can be categorised as:

(i) Individual series: A series in which items are listed singly, i.e., each item is given a separate value of the measurement.

Example: Marks (out of 50): 20, 30, 10, 30, 40, 50, 45, 40, 42, 40

(ii) Discrete series: A series where individual values differ from each other by a definite amount.

Example:

| Marks | 12 | 25 | 35 | 45 | 49 |
|---|---|---|---|---|---|
| No. of students | 3 | 5 | 2 | 2 | 1 |

(iii) Continuous series: A series that represents continuous variables, showing a range of values of different items of the series.

Example:

| Marks | 0 – 10 | 10 – 20 | 20 – 30 | 30 – 40 | 40 – 50 |
|---|---|---|---|---|---|
| No. of students | 1 | 4 | 5 | 6 | 4 |
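
As a rough illustration of how an individual series relates to a discrete series, the sketch below tallies the marks from the individual-series example into value and frequency pairs. The variable names are assumptions made for this sketch, and the text itself does not prescribe any particular procedure.

```python
from collections import Counter

# Individual series from the example: marks (out of 50) of 10 students.
marks = [20, 30, 10, 30, 40, 50, 45, 40, 42, 40]

# Discrete series: each distinct value paired with its frequency,
# listed in ascending order of the value.
frequency = Counter(marks)
discrete_series = sorted(frequency.items())

for value, count in discrete_series:
    print(f"Marks {value:>2}: {count} student(s)")
```

The frequencies add up to the number of items in the individual series, which is a quick check that no observation was lost in the tally.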

## Types of continuous series and their conversion

Q. Discuss the various types of continuous series.

Answer:

(A) Exclusive series

A frequency distribution having classes wherein:

• The upper limit of one class becomes the lower limit of the next class.
• For grouping or counting the number of observations, the lower limit (l1) of a class is included but the upper limit (l2) is not.

| Age (in years) | No. of students |
|---|---|
| 0 – 10 | 3 |
| 10 – 20 | 5 |
| 20 – 30 | 12 |
| 30 – 40 | 6 |
| 40 – 50 | 4 |

In the above example:

– There are five classes.
– Class size = l2 – l1 = 10 (for all classes)
– Mid-value = (l1 + l2) ÷ 2

(B) Inclusive series

A frequency distribution having classes wherein:

• The upper limit of one class is not equal to the lower limit of the next class.
• For grouping or counting the number of observations, both the lower limit (l1) and the upper limit (l2) of a class are included.

| Age (in years) | No. of students |
|---|---|
| 0 – 9 | 3 |
| 10 – 19 | 5 |
| 20 – 29 | 12 |
| 30 – 39 | 6 |
| 40 – 49 | 4 |

(C) Mid-value series

• Mid-value = (l1 + l2) ÷ 2
• The mid-value or mid-point is the central value of a class interval.
• When only such mid-values are given, the series is known as a mid-value series.

| Mid-values | (f) |
|---|---|
| 5 | 3 |
| 15 | 5 |
| 25 | 12 |
| 35 | 6 |
| 45 | 4 |

(D) Open-ended series (distribution)

• In a frequency distribution, if the lower limit (l1) of the first class and the upper limit (l2) of the last class are not given, it is known as an “open-ended distribution”.

| Age (in years) | No. of students |
|---|---|
| Below 10 | 3 |
| 10 – 20 | 5 |
| 20 – 30 | 12 |
| 30 – 40 | 6 |
| 40 and above | 4 |

(E) Continuous series with unequal intervals

• When the class size, i.e., the gap between l2 and l1, is not equal in all the classes, it is known as an unequal class interval series.
• It can be converted into an equal interval distribution by:
• Merging the classes;
• Splitting the classes.

| (X) | (f) |
|---|---|
| 0 – 10 | 3 |
| 10 – 15 | 5 |
| 15 – 30 | 12 |
| 30 – 40 | 6 |
| 40 – 45 | 4 |

(F) Cumulative frequency distribution

• A cumulative frequency series is a modification of the simple frequency distribution.
• It is obtained by successively adding the frequencies of the values or classes.

“Less than” cumulative frequency distribution:

| Age (in years) | No. of students |
|---|---|
| Less than 10 | 3 |
| Less than 20 | 8 |
| Less than 30 | 20 |
| Less than 40 | 26 |
| Less than 50 | 30 |

“More than” cumulative frequency distribution (each total is paired with the lower limit of its class, so all 30 students are counted as more than 0):

| Age (in years) | No. of students |
|---|---|
| More than 0 | 30 |
| More than 10 | 27 |
| More than 20 | 22 |
| More than 30 | 10 |
| More than 40 | 4 |
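
The arithmetic behind the exclusive series, the mid-value series, and the two cumulative distributions can be checked with a short sketch. It uses the age classes and frequencies from the exclusive-series example. The function name `find_class` and the variable names are assumptions made for illustration, and the "more than" totals are paired here with the lower limit of each class.

```python
from itertools import accumulate

# Exclusive series from the example: (l1, l2) limits and frequencies.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [3, 5, 12, 6, 4]

# Class size = l2 - l1, mid-value = (l1 + l2) / 2.
class_sizes = [l2 - l1 for l1, l2 in classes]
mid_values = [(l1 + l2) / 2 for l1, l2 in classes]

# "Less than" cumulative frequencies: running total paired with each upper limit.
less_than = list(accumulate(frequencies))

# "More than" cumulative frequencies: running total taken from the last
# class backwards, paired with each lower limit.
more_than = list(accumulate(reversed(frequencies)))[::-1]


def find_class(value, classes):
    """Exclusive method: the lower limit is included, the upper limit is not."""
    for l1, l2 in classes:
        if l1 <= value < l2:
            return (l1, l2)
    return None  # value lies outside every class


for (l1, l2), mid, lt, mt in zip(classes, mid_values, less_than, more_than):
    print(f"{l1}-{l2}: mid-value {mid}, less than {l2}: {lt}, more than {l1}: {mt}")
```

Under the exclusive method an observation equal to a shared limit, such as an age of exactly 10, is counted in the class 10 – 20 and not in 0 – 10, which is what `find_class(10, classes)` returns.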

### Multiple Choice Questions:

Q.1 Which of the following is the objective of classification?

a. To condense the mass of data.
b. To present data in a simple, logical, and understandable form.
c. To bring out points of similarity and dissimilarity among various groups.
d. All of the above

Q.2 Temperature, height, weight, and marks are examples of ________ .

a. Discrete variables
b. Continuous variables
c. Both a. and b.
d. None of the above

 Answer Key 1 – d., 2 – b.

| Q. no. | Fill in the blanks |
|---|---|
| 1 | _________ of data is the process of arranging data into homogeneous groups according to their common characteristics. |

| Q. no. | Answer Key |
|---|---|
| 1 | Classification |