The focus on ‘big data’ is often misplaced, as it can be both costly and unnecessary for many organizations. Instead, organizations should focus on three ‘small data’ rules: the ‘Rule of 30’, the ‘Rule of Proportions’, and the ‘Rule of Common Sense’.
As a rule of thumb, a sample of 30 is viewed as large enough to produce statistically meaningful results for most data sets: this is the ‘Rule of 30’. But while 30 observations per group will support comparisons between groups, groups that differ in actual population size require an adjustment before the results can represent the entire population: this is the ‘Rule of Proportions’. The adjustment can be made in two ways. The first is to set the sample size of each group according to its actual share of the population. The second is to sample 30 from each group, but to weight the results of each group by its actual proportion of the population.
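The two adjustments described above can be illustrated with a minimal Python sketch. The group names, population sizes, and group means below are hypothetical figures chosen only for illustration, and the function names `proportional_allocation` and `weighted_estimate` are assumptions of this sketch rather than part of any library.

```python
from statistics import mean


def proportional_allocation(population_sizes, total_sample):
    """Split a total sample across groups in proportion to population size.

    Rounding can make the parts differ slightly from total_sample
    when the proportions do not divide evenly.
    """
    total_population = sum(population_sizes.values())
    return {
        group: round(total_sample * size / total_population)
        for group, size in population_sizes.items()
    }


def weighted_estimate(group_means, population_sizes):
    """Combine per-group sample means, weighting each by its population share."""
    total_population = sum(population_sizes.values())
    return sum(
        group_means[group] * population_sizes[group] / total_population
        for group in group_means
    )


# Hypothetical population: group A is four times the size of group B.
population_sizes = {"A": 8000, "B": 2000}

# Method 1: size each group's sample by its share of the population.
allocation = proportional_allocation(population_sizes, total_sample=60)

# Method 2: sample 30 from each group, then weight the group results.
# These means stand in for the results of two samples of 30 each.
group_means = {"A": 4.0, "B": 2.0}
overall = weighted_estimate(group_means, population_sizes)
unweighted = mean(group_means.values())

print(allocation)
print(f"weighted: {overall:.2f}, unweighted: {unweighted:.2f}")
```

With these made-up figures, the unweighted average treats both groups as equal in size, while the weighted estimate leans toward the larger group, which is the point of the adjustment.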
The ‘Rule of Common Sense’ is by far the most important ‘small data’ rule. Because pursuing a perfect ‘big data’ analysis can become very complex, it is important to temper the science of statistical analysis with common sense, the most important asset of a human being. And because the data in most databases is itself subjective, organizations will not benefit substantially from maximizing precision through ‘big data’ analysis.
‘Small data’ beats ‘big data’ most days of the week.