Finishing Systems and Statistics

Posted on Monday, June 8, 2020

By John Claman

I took statistics in the math, business, and psychology departments at my university. (Yes, I was a glutton for punishment!) Following college, I worked in marketing research for years using many of the statistical tools I’d learned in college, but for the last several years, since coming to work for IntelliFinishing, I feel like my statistical skills are getting rusty, at best.

A while back, I found a statistics textbook while rummaging through a few books brought home by my college-age daughter. This brought back warm memories of my college days. I cracked open the textbook to read several beginning chapters–just to refresh–and immediately, I was struck by several techniques I’d forgotten and a few other cautionary notes about metrics I thought I was very knowledgeable about. Specifically, I started thinking of ways that these statistical methods might be helpful to the process of finishing products.

Most of the people I speak with about finishing systems started out “on the line” or are managers, engineers, or owners who know their business, but they are rarely statistics majors. Nor do they spend much time honing their statistical skills and knowledge. So, this article is intended to refresh the reader’s knowledge of some very basic statistical metrics and tools that may be useful in finishing, and to serve as a reminder to crack open some of those math textbooks every once in a while.

An endless list of metrics can be important in the finishing of products. Most of you are probably calculating some of these measures or are at least familiar with them. Modern automated finishing systems can report many stats as part of their supervisory control and data acquisition (SCADA) software, which also controls the system. They may also be able to tie this data into other database-driven systems to create even more data and useful reports.

The following metrics are grouped by process, throughput, and product. They are listed in no particular order and represent only a relatively short brainstorming session.

Process Metrics
• Pre-finish process timing.
• Total system part completion timing.
• Color change timing.
• Wash titrations.
• Ground measurements.
• Amount of powder or liquid used.
• Amount of powder reclaimed.
• OEE (Overall Equipment Effectiveness).
• Utility usage for natural gas, electricity, water, etc.
• Manpower used or required per time-frame.
• Total cost to operate overall and per process.
• Costs for consumable items and for replacement items like guns, lubricant, wearable items on conveyors and equipment, PPE, nozzles, etc.
• Line speeds.
• Heated wash stage temperatures and variation.
• Oven temperatures and variation within the oven and during part dry off or curing steps.
• Masking materials used.
• Racking materials use, reuse, cleansing, costs, etc.
• Safety incidents on the line.

Throughput Metrics
• Parts per time-frame.
• Total weight processed.
• Throughput per shift, day-part, or weekday.
• Line density.
• Carrier per time-frame.
• Total length of product processed.
• Square inch or foot of product finished.
• Total cost to operate per time-frame.

Product Metrics
• Mil thickness.
• Salt spray, chemical resistance, cross hatch, pencil hardness, abrasion, gloss, color variance, etc., test results.
• Reject rates.
• Square footage per part.
• Part dimensions including length, height, width, and substrate thickness.
• Scratch test results.
• Cross hatch test results.
• Redo rates.

I’m sure you can probably think of many more metrics important to your operation!

If we have a good idea of what to measure, what basic statistical tools should we use to analyze the data we gather and what common statistical mistakes should we avoid? One of the most common statistical tools you are likely to see presented is the average or mean. It’s a very powerful statistic that most of us are familiar with and understand…to a point.

However, the mean or average can be misleading. Many know this, but it is often overlooked. The mean reports the central tendency of the data, but what if the data is skewed? For example, take coating mil thickness. What if a few parts measured have especially thick coatings, while most are near the desired level? The mean will be pulled toward the higher end of the data rather than representing the middle value. Management might, consequently, go on a rampage to reduce the mil thickness (and paint costs) applied by the system or the painters, when in reality it’s just a few oddball occurrences that are throwing the mean higher than expected or desired. Perhaps understanding this pitfall of means would lead management to investigate further and discover that just one operator tends to lay down paint or powder much thicker than the others. Maybe this painter just needs more training and oversight, rather than a complete overhaul of the system or process.
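To make the point concrete, here is a minimal Python sketch. The readings are invented for illustration (ten parts near an assumed 2.0-mil target plus three heavy coats), and the 4.0-mil cutoff used to set the heavy coats aside is likewise just an assumption.

```python
from statistics import mean, median

# Hypothetical mil-thickness readings (invented data, not real measurements):
# ten parts near an assumed 2.0-mil target, plus three heavy coats.
readings = [1.9, 2.0, 2.1, 2.0, 1.8, 2.2, 2.0, 2.1, 1.9, 2.0, 5.5, 6.0, 5.8]

avg = mean(readings)
mid = median(readings)
print(f"mean = {avg:.2f} mils, median = {mid:.2f} mils")

# Setting the heavy coats aside (assumed cutoff of 4.0 mils) shows how far
# three readings out of thirteen pulled the mean.
typical = [r for r in readings if r < 4.0]
typical_avg = mean(typical)
print(f"mean without the heavy coats = {typical_avg:.2f} mils")
```

With this made-up data the mean comes out near 2.87 mils while the median stays at 2.00 mils, which is the kind of gap that should prompt a closer look before anyone overhauls the process.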

Another problem with means is that there may not be a true central tendency. The measures could have multiple peaks, also known as modes, perhaps near either end of the distribution. In that case, the mean may suggest that most measures are near the figure reported when, actually, many of the measures are much higher or much lower than the reported mean.
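A small sketch of that situation, using invented part completion times for two hypothetical part families (one quick, one slow). The 3-minute window around the mean is an arbitrary choice for illustration.

```python
from statistics import mean

# Hypothetical completion times in minutes (invented data):
# one part family finishes quickly, the other takes about twice as long.
quick_parts = [14, 15, 15, 16, 15, 14, 16]
slow_parts = [29, 30, 31, 30, 30, 29, 31]
times = quick_parts + slow_parts

avg = mean(times)

# Count how many readings actually sit close to the reported mean.
near_avg = [t for t in times if abs(t - avg) <= 3]
print(f"mean = {avg:.1f} min, readings within 3 min of the mean: {len(near_avg)}")
```

Here the mean is 22.5 minutes, yet not a single part finished within 3 minutes of that figure.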

So, the first question to ask when seeing a mean score is about the underlying data. Is it roughly distributed in a bell-shaped curve, or at least symmetric around its central score? If not, you may be better off considering the median. The median is simply the value in the exact middle of all your measures when sorted from low to high. Imagine you measure throughput every day for a month. If you sort your throughput figures from low to high, count the total number of measures, let’s say 30, and then look at the 15th measure (or, to be more precise, the average of the 15th and 16th measures), you will have the median. The median might more accurately represent typical throughput from Monday through Friday because it is far less affected by outliers such as Saturday and Sunday.
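The sort-and-pick-the-middle procedure can be sketched in a few lines of Python. The daily counts below are invented (busy weekdays, a light Saturday, an idle Sunday), and `median_by_hand` is a hypothetical helper name; the standard library's `statistics.median` does the same job.

```python
from statistics import mean, median

def median_by_hand(values):
    """Sort the values and return the middle one (or the average of the
    two middle values when the count is even)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

# Hypothetical parts-per-day counts for 30 days (invented data):
# four full weeks of Monday through Sunday, plus two more weekdays.
one_week = [410, 395, 405, 400, 390, 120, 0]
daily_parts = one_week * 4 + [402, 398]

mid = median_by_hand(daily_parts)  # average of the 15th and 16th values
avg = mean(daily_parts)
print(f"median = {mid:.1f} parts/day, mean = {avg:.1f} parts/day")
```

For this made-up month the median is 395 parts per day while the mean is about 309, because the weekend days drag the mean down.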

And let’s not leave out the very often overlooked mode! The mode is the most-often reported number in the list. For the mode, it’s sometimes enlightening to calculate how often it is reported. For example, if the most-often reported surface area per part painted is 25 square inches, and it represents 50 percent of all parts painted, this may be more informative than simply reporting the average (mean) or the median.
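As a rough sketch, the mode and its share of all measures can be found with Python's `collections.Counter`. The part areas are invented, and `mode_with_share` is a hypothetical helper name.

```python
from collections import Counter

def mode_with_share(values):
    """Return the most frequent value and the fraction of all values it represents.
    If several values tie for most frequent, only one of them is returned."""
    counts = Counter(values)
    value, count = counts.most_common(1)[0]
    return value, count / len(values)

# Hypothetical surface area per part, in square inches (invented data).
part_areas = [25, 25, 40, 25, 60, 25, 40, 25, 90, 60]

mode_value, share = mode_with_share(part_areas)
print(f"mode = {mode_value} sq in, {share:.0%} of parts")
```

With this data the mode is 25 square inches and it covers half of the parts, while the mean (41.5 square inches) describes no part at all.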

So, to back up a bit, before you calculate any means, medians, or modes, try doing a histogram first. A histogram is a bar chart that shows how many measures fall into each range of values. (A simple dot or scatter plot of the raw data can serve a similar purpose.) This “picture” of the data can usually tell you much about which measure of central tendency is appropriate, what outliers there may be, the minimums and the maximums, etc.
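You do not need charting software for a first look. The sketch below prints a crude text histogram, counting invented mil-thickness readings in bins half a mil wide. The bin width and the helper name `text_histogram` are assumptions chosen for illustration.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Return (bin_start, count) rows; each bin covers
    bin_start up to, but not including, bin_start + bin_width."""
    counts = Counter(int(v // bin_width) for v in values)
    rows = []
    # Walk every bin between the lowest and highest, so empty bins show as gaps.
    for index in range(min(counts), max(counts) + 1):
        rows.append((index * bin_width, counts.get(index, 0)))
    return rows

# Hypothetical mil-thickness readings (invented data).
readings = [1.9, 2.0, 2.1, 2.0, 1.8, 2.2, 2.0, 2.1, 1.9, 2.0, 5.5, 6.0, 5.8]

rows = text_histogram(readings, bin_width=0.5)
for start, count in rows:
    print(f"{start:4.1f} | {'#' * count}")
```

The printout shows one tall cluster around 2 mils, a long empty stretch, and a small second cluster between 5.5 and 6 mils, which is exactly the kind of outlier group a single mean would hide.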

While there are many other statistical tools I could highlight, I want to especially include one I’d completely forgotten about since college. It’s called a “five-number summary.” The typical five-number summary includes the minimum value, the first quartile, the median, the third quartile, and the maximum value. With the values sorted from low to high, the first quartile is the value at the top of the bottom one-fourth of the values, and the third quartile is the value where three-quarters of the values are below it and one-quarter are above it. In effect, the median is the second quartile value for the data.

[Figure: An example of a five-number summary graphed as a box-and-whisker plot.]

The five-number summary is typically reported from lowest to highest:
1. Minimum value.
2. Q1 quartile value.
3. Median.
4. Q3 quartile value.
5. Maximum value.

Using a five-number summary (which can also be graphed as a box-and-whisker plot, often simply called a box plot) can tell you a lot about the range of the values you are considering as well as the data’s central tendency and skew.
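Here is a minimal sketch of a five-number summary in Python (3.8 or later) using the standard library's `statistics.quantiles`. The readings are invented, and the helper name `five_number_summary` is hypothetical. Textbooks and software packages use slightly different conventions for locating quartiles, so the "inclusive" method chosen here is one reasonable option rather than the only correct one.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return the minimum, Q1, median, Q3, and maximum as a labeled dict."""
    # n=4 splits the data into quarters and returns the three cut points.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "minimum": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "maximum": max(values),
    }

# Hypothetical mil-thickness readings (invented data).
readings = [1.8, 1.9, 2.0, 2.0, 2.1, 2.2, 2.4, 3.0, 5.5]

summary = five_number_summary(readings)
for label, value in summary.items():
    print(f"{label:>8}: {value:.1f}")
```

In this example the three quartile values sit within 0.4 mils of each other (2.0, 2.1, and 2.4) while the maximum is 5.5, so the summary flags a long upper tail at a glance.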

Of course, a standard deviation calculation can also help determine the spread of values and their concentration around a mean, but like the mean, it is most meaningful for a normal or close to bell-shaped distribution. And now we are starting to get pretty deep into statistical concepts beyond the scope of this article.
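Without going any deeper, a short sketch shows what the standard deviation adds. Both sets of oven temperature readings below are invented and share the same mean; only the spread differs. The sample standard deviation (`statistics.stdev`) is used here on the assumption that the readings are a sample from ongoing production.

```python
from statistics import mean, stdev

# Hypothetical oven temperature readings in degrees F (invented data).
steady_oven = [398, 400, 402, 399, 401]
swinging_oven = [380, 420, 390, 410, 400]

steady_sd = stdev(steady_oven)
swinging_sd = stdev(swinging_oven)

print(f"steady:   mean = {mean(steady_oven)}, std dev = {steady_sd:.1f}")
print(f"swinging: mean = {mean(swinging_oven)}, std dev = {swinging_sd:.1f}")
```

Both ovens average 400 degrees, but the second one's standard deviation is ten times larger (about 15.8 versus 1.6), a difference the mean alone would never reveal.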

Bottom line, I’m simply suggesting that we can be inadvertently misled by our over-reliance on familiar statistics if we don’t take the time to occasionally review our basic understanding of the subject. So, pick up your old textbooks (or borrow one from your children) and dive in once in a while. You might find something very useful that will help you better understand and possibly streamline your finishing operation!

John Claman is sales representative and marketing supervisor for IntelliFinishing.