Quality Advisor

Default White Guide

SPC DEMO

Minimize Production Costs, Quickly Detect Issues, and Optimize Your Product Quality

Don’t miss out! Book a demo of our specialized SPC software and unlock immediate improvements in your processes.

Sampling

A resource for data collection tools, including how to collect data, how much to collect, and how frequently to collect it.

What is it?

Sampling is a tool that is used to indicate how much data to collect and how often it should be collected. This tool defines the samples to take in order to quantify a system, process, issue, or problem.

To illustrate sampling, consider a loaf of bread. How good is the bread? To find out, is it necessary to eat the whole loaf? No, of course not. To make a judgment about the entire loaf, it is necessary only to taste a sample of the loaf, such as a slice. In this case the loaf of bread being studied is known as the population of the study. The sample, the slice of bread, is a subset or a part of the population.

Now consider a whole bakery. The population of interest is no longer a loaf, but all the bread that has been made today. A sample size of one slice from one loaf is clearly inadequate for this larger population. The sample collected will now become several loaves of bread taken at set times throughout the day. Since the population is larger, the sample will also be larger. The larger the population, the larger the sample required.

In the bakery example, bread is made in an ongoing process. That is, bread was made yesterday, throughout today, and will be made tomorrow. For an ongoing process, samples need to be taken to identify how the process is changing over time. Studying how the samples are changing with control charts will show where and how to improve the process, and allow prediction of future performance.

For example, the bakery is interested in the weight of the loaves. The bakery does not want to weigh every single loaf, as this would be too expensive, too time consuming, and no more accurate than sampling some of the loaves. Sampling for improvement and monitoring is a matter of taking small samples frequently over time. The questions now become:

  • How many loaves to weigh each time a sample is taken?
  • How often to collect a sample?

These two questions, “how much?” and “how often?” are at the heart of sampling.

When is it used?

  • Sampling is used any time data is to be gathered.
    Data cannot be collected until the sample size (how much) and sample frequency (how often) have been determined.
  • Sampling should be periodically reviewed.
    When data is being collected on a regular basis to monitor a system or process, the frequency and size of the sample should be reviewed periodically to ensure that they are still appropriate.

How is it done?

  1. What questions are being asked of the data?
    Before collecting any data, it is essential to define clearly what information is required. It is easy to waste time and resources by collecting the wrong data or by failing to collect enough information at the time of data collection. Try to anticipate the questions that will be asked when analyzing the data. What additional information would be desirable? Recording additional information during data collection is easy; tracking it down later is far more difficult and may not be possible.
  2. Determine the frequency of sampling.
    The frequency of sampling refers to how often a sample should be taken. A sample should be taken at least as often as the process is expected to change. Examine all factors that are expected to cause change, and identify the one that changes most frequently. Sampling must occur at least as often as the most frequently changing factor in the process. For example, if a process has exhibited the behavior shown in the diagram below, how often should sampling occur in order to get an accurate picture of the process?
    Factors to consider might be changes of personnel, equipment, or materials. The questions identified in step 1 may give guidance to this step.
    Common frequencies of sampling are hourly, daily, weekly, or monthly. Although frequency is usually stated in time, it can also be stated as a count: every tenth part, every fifth purchase order, or every other invoice, for example. If it is not clear how frequently the process changes, collect data frequently, examine the results, and then set the frequency accordingly.
  3. Determine the actual frequency times.
    The purpose of this step is to state the actual time to take the samples. For instance, if the frequency were determined to be daily, what time of day should the sample be taken: in the morning at 8:00 am, around midday, or late in the day around 5:00 pm? This is important because inconsistent intervals between samples make the data unreliable for further analysis. For example, if a sample is to be taken daily, and on one day it is taken at 8:00 am, the next day at 5:00 pm, and the following day at midday, the timing between the samples is inconsistent and the collected data will also be inconsistent. The data will exhibit unusual patterns and will be less meaningful. Stating the time that the sample is to be taken will reduce this type of error. Choose a time that is as close as possible to any expected change in the process and at which taking a sample is convenient. Avoid difficult times, such as during a shift change or lunch break.
  4. Select the subgroup (sample) size.
    A subgroup (or sample) is the number of items to be examined at the same time. The terms “subgroup” and “sample” may be used interchangeably. When doing calculations, subgroup size is denoted by the letter n. To choose the most appropriate subgroup size, determine first whether the data being collected is “variables data” or “attributes data.”
    For variables data: When measuring variables data, a subgroup size larger than one is preferable because larger subgroup sizes yield greater possibilities for analysis. However, it may not be possible to get a subgroup size larger than one. Some examples of this are electricity usage per month, profit per month, sales per month, temperature of a room, and the viscosity of a fluid. In situations such as these, when a subgroup size larger than one does not make sense, the subgroup (or sample) size is equal to one.

    If a subgroup size larger than one can be chosen, the size is usually between three and eight, a range that has been determined to be statistically efficient. The most commonly used subgroup size is five. When more data is desired, the frequency of taking samples, not the subgroup size, should be increased.

    When a sample is taken, it should be selected to ensure that conditions within the sample are similar. If gathering a sample size of five, for example, take all five pieces in a row as they are produced in the process. This is known as a rational subgroup.

    For attributes data: The subgroup size for attributes data depends on the process being sampled. The general rule of thumb is to gather a large enough sample so that all possible characteristics being investigated will appear. That is, the sample is large enough that a “0” occurrence is rare.

    Begin by answering the question, “How many items does this process produce during the frequency interval (per hour, week, etc.)?” When that number is determined, the sample size should be at least the square root of that number. For instance, if a purchasing department processes 100 purchase orders per week, an appropriate sample size would be 10 purchase orders per week (the square root of 100 is 10).
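
The square-root rule of thumb for attributes data can be sketched in a few lines of Python. This is only an illustration: the function name `attribute_sample_size` is hypothetical, and rounding up to the next whole item is an assumption made here so that the result is never smaller than the square root.

```python
import math


def attribute_sample_size(items_per_interval: int) -> int:
    """Smallest whole sample size that is at least sqrt(items_per_interval).

    items_per_interval: how many items the process produces during one
    sampling interval (per hour, per week, and so on).
    """
    if items_per_interval < 1:
        raise ValueError("items_per_interval must be at least 1")
    root = math.isqrt(items_per_interval)
    # Round up when the count is not a perfect square (assumption of this sketch).
    return root if root * root == items_per_interval else root + 1


# The purchase-order example: 100 orders per week.
print(attribute_sample_size(100))  # 10
```

For a process that produces 150 items per interval, the same function returns 13, since 12 squared is only 144.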

The above article is an excerpt from the “Sampling” chapter of Practical Tools for Continuous Improvement: Volume 1 – Statistical Tools. The full chapter provides more details on sampling.

Data Analysis Tools

Tools for analyzing and interpreting data so that areas to improve become apparent.

What type of data do I have?

Variables charts (measurement data)

Consists of measurements of a characteristic, such as length, weight, density, time, or pressure.

Control charts: Is your process stable and in control?
  • X-bar & range: Use this if your data has a subgroup size of 2-10 observations.
  • X-bar & sigma: Use this if your data has a subgroup size of 11 or more observations.
  • X-MR: Use this if your data has a subgroup size of 1 observation.
  • Median: Use this to analyze measurement data when you want to plot all observations.
Run chart: Use this to see trends and patterns if there is not enough data for a control chart.
Histogram: Use this to determine if your data has a normal distribution.
Capability analysis: Use this to determine if your process is capable of producing output within specification limits.
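
The subgroup-size guidance for variables control charts can be expressed as a small lookup. The sketch below is illustrative only; the function name `variables_chart` is hypothetical, and it covers just the three charts that are chosen by subgroup size.

```python
def variables_chart(subgroup_size: int) -> str:
    """Suggest a variables control chart from the subgroup size."""
    if subgroup_size < 1:
        raise ValueError("subgroup_size must be at least 1")
    if subgroup_size == 1:
        return "X-MR"            # individual observations
    if subgroup_size <= 10:
        return "X-bar & range"   # subgroups of 2-10 observations
    return "X-bar & sigma"       # subgroups of 11 or more observations


for n in (1, 5, 12):
    print(n, variables_chart(n))
```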

 

Attributes (counts data)

Consists of counts of defects per item (nonconformities) or counts of defective items (nonconforming units). For example, the number of blemishes counted on an individual part or the number of non-working parts in a sample.

Control charts: Is your process stable and in control?
  • np-chart: Use this if your data is a count of nonconforming units and the subgroups are all the same size.
  • p-chart: Use this if your data is a count of nonconforming units and the subgroup size varies.
  • c-chart: Use this if your data is a count of nonconformities and the subgroups are all the same size.
  • u-chart: Use this if your data is a count of nonconformities and the subgroup size varies.
Capability analysis: Use this to determine capability for attributes data.
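
The choice among the four attributes charts depends on two yes/no questions: what is being counted, and whether the subgroup size is constant. A minimal sketch of that decision follows; the function and parameter names are hypothetical.

```python
def attributes_chart(counts_nonconformities: bool, constant_subgroup_size: bool) -> str:
    """Suggest an attributes control chart.

    counts_nonconformities: True when counting defects per item
        (nonconformities), False when counting defective items
        (nonconforming units).
    constant_subgroup_size: True when every subgroup has the same size.
    """
    if counts_nonconformities:
        return "c-chart" if constant_subgroup_size else "u-chart"
    return "np-chart" if constant_subgroup_size else "p-chart"


# Blemishes per part, with a varying number of parts inspected each time.
print(attributes_chart(counts_nonconformities=True, constant_subgroup_size=False))  # u-chart
```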

 

Pareto (counts in categories)

Consists of a count of items or occurrences, such as the number of defective items, the number of scratches on a door panel, or how often a specific problem occurs.

Pareto diagram: Use this to analyze counts that are in categories.

 

Rare event

Use this when other control charts are not effective for determining whether your process is stable.

g-chart: Use this if your count data occurs infrequently. It counts the number of events between rarely occurring errors or nonconforming incidents.
t-chart: Use this if your error or nonconforming incident occurs infrequently. Each point on the chart represents the amount of time that has passed since the prior nonconforming incident occurred.

 

Data Collection Tools

A resource for data collection tools, including how to collect data, how much to collect, and how frequently to collect it.

Sampling

A tool used to indicate how much data to collect and how often it should be collected.

Operational Definition

A clear, concise, detailed definition of a measure.

Improving Measurement Accuracy with Gage R&R

Gage R&R refers to testing the repeatability and reproducibility of the measurement system.

Formulas and Tables

Process performance indices

Process performance indices use sigma of the individuals.

Pp

Pp for one-sided specifications

If you are using one-sided specifications, use the following formulas to determine the Pp:

Upper specification

Lower specification

Ppk

Where:

Zmin is the smaller of Zupper and Zlower.

Using sigma of the individuals:

Pr
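
As a sketch, the process performance indices named above are commonly written as follows. These are the usual textbook forms, not a reproduction of the original formulas. The notation is assumed here: USL and LSL are the upper and lower specification limits, \bar{X} is the process mean, and \sigma_i is the sigma of the individuals.

```latex
\begin{align*}
P_p &= \frac{USL - LSL}{6\,\sigma_i} \\[4pt]
P_p\ \text{(upper specification only)} &= \frac{USL - \bar{X}}{3\,\sigma_i} \\[4pt]
P_p\ \text{(lower specification only)} &= \frac{\bar{X} - LSL}{3\,\sigma_i} \\[4pt]
P_{pk} &= \frac{Z_{\min}}{3},
\qquad Z_{upper} = \frac{USL - \bar{X}}{\sigma_i},
\qquad Z_{lower} = \frac{\bar{X} - LSL}{\sigma_i} \\[4pt]
P_r &= \frac{6\,\sigma_i}{USL - LSL} = \frac{1}{P_p}
\end{align*}
```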

Capability indices

Capability indices use estimated sigma.

Cp

Cp for one-sided specifications

If you are using one-sided specifications, use the following formulas to determine the Cp:

Upper specification

Lower specification

Cr

Cpk

Where:

Zmin is the smaller of Zupper and Zlower.

Using estimated sigma:

Cpm

Where, in the sigma used for Cpm:

T = specification target (nominal)

Xi = the i-th individual reading

n = total number of individual readings

Σ = symbol for summation
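
As a sketch, the capability indices named above are commonly written as follows. These are the usual textbook forms, not a reproduction of the original formulas. The notation is assumed here: USL and LSL are the upper and lower specification limits, \bar{X} is the process mean, and \hat{\sigma} is the estimated sigma. The n - 1 divisor in the Cpm sigma is also an assumption; some references divide by n instead.

```latex
\begin{align*}
C_p &= \frac{USL - LSL}{6\,\hat{\sigma}} \\[4pt]
C_p\ \text{(upper specification only)} &= \frac{USL - \bar{X}}{3\,\hat{\sigma}} \\[4pt]
C_p\ \text{(lower specification only)} &= \frac{\bar{X} - LSL}{3\,\hat{\sigma}} \\[4pt]
C_r &= \frac{6\,\hat{\sigma}}{USL - LSL} = \frac{1}{C_p} \\[4pt]
C_{pk} &= \frac{Z_{\min}}{3},
\qquad Z_{upper} = \frac{USL - \bar{X}}{\hat{\sigma}},
\qquad Z_{lower} = \frac{\bar{X} - LSL}{\hat{\sigma}} \\[4pt]
C_{pm} &= \frac{USL - LSL}{6\,\sigma_{Cpm}},
\qquad \sigma_{Cpm} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - T)^2}{n - 1}}
\end{align*}
```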

t-chart formula

The t-chart formula:

g-chart formula

The g-chart formula:

c-chart formulas

The c-chart formula (for number of nonconformities, from subgroups of a constant size):
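
A minimal sketch of the c-chart calculation, assuming the widely used textbook form: the center line is the average count, and the control limits sit three times the square root of that average above and below it. The function name `c_chart_limits` is hypothetical, and flooring the lower limit at zero is an assumption of this sketch, since a count cannot be negative.

```python
import math


def c_chart_limits(counts):
    """Return (LCL, center line, UCL) for a c-chart.

    counts: number of nonconformities found in each subgroup; all
    subgroups are assumed to be the same size.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    c_bar = sum(counts) / len(counts)   # center line: average count
    spread = 3 * math.sqrt(c_bar)       # three-sigma distance for count data
    lcl = max(0.0, c_bar - spread)      # floor at zero (assumption)
    return lcl, c_bar, c_bar + spread


lcl, center, ucl = c_chart_limits([16, 16, 16, 16, 16])
print(lcl, center, ucl)  # 4.0 16.0 28.0
```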

u-chart formula

The u-chart formula (for number of nonconformities from subgroups that can vary in size):
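
A minimal sketch of the u-chart calculation, assuming the widely used textbook form: the center line is total nonconformities divided by total units inspected, and each subgroup gets its own limits because the subgroup size appears under the square root. The function name `u_chart_limits` is hypothetical, and flooring the lower limit at zero is an assumption of this sketch.

```python
import math


def u_chart_limits(nonconformities, subgroup_sizes):
    """Return (u_bar, [(LCL, UCL), ...]) with one limit pair per subgroup.

    nonconformities: count of nonconformities found in each subgroup.
    subgroup_sizes: number of units inspected in each subgroup.
    """
    if not nonconformities or len(nonconformities) != len(subgroup_sizes):
        raise ValueError("inputs must be non-empty and of equal length")
    u_bar = sum(nonconformities) / sum(subgroup_sizes)  # center line
    limits = []
    for n in subgroup_sizes:
        spread = 3 * math.sqrt(u_bar / n)  # limits narrow as n grows
        limits.append((max(0.0, u_bar - spread), u_bar + spread))
    return u_bar, limits


u_bar, limits = u_chart_limits([5, 8], [4, 9])
print(u_bar)      # 1.0
print(limits[0])  # (0.0, 2.5)
```

Because the limits depend on each subgroup's size, larger subgroups are held to tighter limits than smaller ones.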