SOME MINITAB TUTORIALS

From time to time, I will post here some segments or short tutorials for doing things in Minitab. Here, I want to emphasize basic operations ... and not highly specialized routines. If you have any suggestions, please send me a note ... Dennis Roberts

MAKING A GRAPH LOOK THE WAY YOU WANT IT TO

Many times, when you have data and make a graph, the default way that it comes out is not exactly the way you would like to see it. What can you do? Well, in Minitab, there are many things you can do to enhance the graph. In this short tutorial, I want to focus on two graph types ... histograms and scatter plots ... and show both the way they come out if you use the basic histogram and plot commands, and then some options you can invoke to make them more detailed and perhaps more readable. Please keep in mind that what I show here just scratches the surface ... and that there are many other options that you can use. Suppose we have a set of data and make a basic histogram of it. It might look like the following:

But, when you look at it, you decide that there are not enough tick marks along both the X and Y axes, that you would prefer to have a separate vertical bar for each score value ... and, you would like some grid like lines that go horizontally across the graph to make the heights of bars (that is, the frequencies) easier to interpret.

Now, with simple subcommands (and there are dialog boxes to do this too), we can change the number of bars, add more tick marks on X and/or Y, and put in grid lines that go horizontally ... all these are easily accomplished. Look at what I have done in terms of Minitab commands:

MTB > hist c2;
SUBC> midpoint 30:50;
SUBC> tick 1 30:50;
SUBC> tick 2 0:20;
SUBC> grid 2;
SUBC> color 4.

Now, the histogram command by itself simply makes the basic histogram, but the subcommands make alterations to that. The subcommand 'midpoint' regulates the NUMBER OF VERTICAL BARS you see ... and midpoint 30:50 will force a bar at each value from 30 to 50 in units of 1. But bars are not tick marks along the baseline, so the subcommand tick 1 30:50 makes tick marks along the X axis (that is tick 1) from 30 to 50 in increments of 1. The subcommand tick 2 0:20 regulates the Y axis (tick 2) tick marks ... from 0 to 20 (if there had been a 20 for frequency), again in increments of 1. Finally, the grid 2 (Y axis) subcommand puts grid lines horizontally across the histogram at each tick mark ... and the additional subcommand color 4 changes the color from the default of black to a light blue. So, with all of this ... we get a revised and more functional histogram that looks like:

Another illustration of modifying graphs to make them more in line with what you want is as follows. Look at the following scatterplot that I generated.

This shows a rather classic positive correlation pattern ... uphill on the plot between X and Y. But you look at it and wish there had been more tick marks along both X and Y. In addition, the scatterplot happens to be made up of data from both males and females ... and you have coded that fact in some other column ... say, males = 0 and females = 1. What you would like to do is make a new plot with grid lines to help in reading the coordinates of the data points and with more tick marks on the X and Y axes ... but IN ADDITION, you would like to make a distinction in the graph between the male and female data points. So, here is what we could do.

MTB > plot c17*c16;
SUBC> tick 1 20:80/5;
SUBC> tick 2 20:80/5;
SUBC> grid 1;
SUBC> grid 2;
SUBC> symb c30;
SUBC> color 2 4.

Similar to the histogram, tick 1 and tick 2 change the tick mark locations on the X and Y axes ... and the /5 simply makes them go up in units of 5. Grid 1 and grid 2 put vertical and horizontal grid lines at the tick marks ... up or across the graph. Now, to get the data points for males and females differentiated ... we use a symb subcommand, and I put c30 ... since that is where I had placed the codes of 0 for males and 1 for females in the worksheet. Since there are 2 codes in c30 ... I then used another subcommand (a subcommand to symb, actually) for color ... and put 2 and 4 there so that the dots for males and females would be different colors. [NOTE: I decided to differentiate males and females with different colors, but you could have used OTHER colors ... or even different symbols if you had wanted to ... squares for males and circles for females, for example.] Have a look.

So, we see that by using a few subcommands (and these are accessible via the pull down menus and dialog boxes ... though I think that way is more cumbersome), you can greatly enhance graphs. Here are the few that we looked at when making histograms and/or plots.

• tick 1/tick 2 ... for altering the tick mark arrangements on X and/or Y axes
• grid 1/grid 2 ... for making vertical and/or horizontal lines on the graph at the tick marks
• midpoint ... for histograms, regulates the number of vertical bars
• symb ... allows many options for symbol types, colors, etc.
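
As a rough illustration of the range notation used in the tick and midpoint subcommands above (30:50 and 20:80/5), here is a small Python sketch, outside of Minitab, that expands such a string into the list of values it stands for. The function name expand_range is made up for this illustration, and it assumes whole numbers only ... it shows the idea of the notation, not how Minitab itself reads it.

```python
def expand_range(spec):
    """Expand a 'start:end' or 'start:end/step' string into a list of integers.

    Hypothetical helper for illustration only; assumes whole-number values.
    """
    bounds, _, step_text = spec.partition("/")
    start_text, end_text = bounds.split(":")
    start, end = int(start_text), int(end_text)
    step = int(step_text) if step_text else 1   # no '/step' means units of 1
    return list(range(start, end + 1, step))    # the end value is included


print(expand_range("30:50"))     # every whole number from 30 through 50
print(expand_range("20:80/5"))   # 20, 25, 30, ... up through 80
```

So tick 1 30:50 asks for 21 tick positions, while tick 1 20:80/5 asks for 13.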

USING MINITAB TO WORK POWER PROBLEMS

One of the worst things in statistical instruction is discussing the concept of power. It seems to be confusing to all of us ... especially students. We say ... more power to you but ... it bounces off them like rain. Minitab does have some routines for working various power problems ... in single sample cases, in two sample cases, for proportions, and for a simple ANOVA case. I will concentrate on the simple 1 sample case using a z test ... where critical values are based on z values from a normal distribution (like 1.96 or 2.58).

Now, let's first look at the two general dialog boxes that are appropriate in this case:

Now, there are 3 possibilities here: A) find power given 1 or more sample sizes and a difference between the null and true parameter, B) for a fixed n, find the power values for various differences, and C) estimate the n needed for 1 or more power values, given a difference between the null and true parameter. Note that in all of these, there is a place for entering the population sigma value.

Note there is an Options box ... if you click on this, you will see:

Here you can opt for 1 or 2 tail tests, change the Type I error rate, and even specify where you would like output stored. Unless you change them, the defaults are alpha = .05 and a 2 tail test.
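
To see how the alpha and tail choices translate into the critical z values mentioned earlier (1.96 and 2.58), here is a short Python sketch using only the standard library. It is a side calculation outside of Minitab, and the function name critical_z is made up for this illustration.

```python
from statistics import NormalDist


def critical_z(alpha=0.05, two_tailed=True):
    """Critical z value for a given Type I error rate (illustrative helper)."""
    # A 2 tail test splits alpha between the two tails; a 1 tail test does not.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


print(round(critical_z(0.05), 2))                    # 1.96
print(round(critical_z(0.01), 2))                    # 2.58
print(round(critical_z(0.05, two_tailed=False), 3))  # 1.645
```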

What if we want to see what happens to power if n changes ... from 10 to 100, in increments of 10 ... assuming that the null/true parameter difference is 3 and the population sigma = 10? Here is the dialog box you would see:

Note that you enter the different n values ... and the difference, and the sigma. When you click on OK ... you would see the following in the session window output.

1-Sample Z Test

Testing mean = null (versus not = null)
Calculating power for mean = null + 3
Alpha = 0.05  Sigma = 10

Sample
  Size   Power
    10  0.1578
    20  0.2687
    30  0.3759
    40  0.4751
    50  0.5641
    60  0.6420
    70  0.7088
    80  0.7653
    90  0.8122
   100  0.8508
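
If you want to check where these numbers come from, the following Python sketch applies the usual textbook power formula for a 2 tail, 1 sample z test: shift the sampling distribution by the true difference (in standard error units) and add up the area beyond both critical values. This is my assumption about the calculation, not a statement of how Minitab computes it internally, and the function name z_power is made up for the illustration. It should agree with the table above to about four decimals.

```python
from math import sqrt
from statistics import NormalDist


def z_power(n, difference, sigma, alpha=0.05):
    """Power of a 2 tail, 1 sample z test (textbook formula, illustrative)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = difference * sqrt(n) / sigma          # difference in standard error units
    upper = 1 - std_normal.cdf(z_crit - shift)    # area beyond the upper critical value
    lower = std_normal.cdf(-z_crit - shift)       # area beyond the lower critical value
    return upper + lower


# Same conditions as the dialog box: difference = 3, sigma = 10, n = 10 to 100 by 10.
for n in range(10, 101, 10):
    print(f"{n:4d}  {z_power(n, difference=3, sigma=10):.4f}")
```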

An old time graph would look like:

MTB > plot c2 c1

Plot

power   -
        -                                                   *
        -                                              *
    0.75+                                         *
        -                                    *
        -                               *
        -
        -                          *
    0.50+                     *
        -
        -                *
        -
        -
    0.25+           *
        -
        -      *
        -
          +---------+---------+---------+---------+---------+------n
          0        20        40        60        80       100

What if you were interested in how power would change as the distance between the null and true parameter values (from 0 to 5 in .5 increments) changed (with fixed sample size and sigma)? Here is the dialog box you would see if you input that information.

And if you click OK, you would get the following in the session window ...

MTB > Power;
SUBC>   ZOne;
SUBC>     Difference 0 .5 1 1.5 2 2.5 3 3.5 4 4.5 5;
SUBC>     Sample 25;
SUBC>     Sigma 10.

Power and Sample Size

1-Sample Z Test

Testing mean = null (versus not = null)
Alpha = 0.05  Sigma = 10  Sample Size = 25

Difference   Power
       0.0  0.0500
       0.5  0.0572
       1.0  0.0791
       1.5  0.1165
       2.0  0.1701
       2.5  0.2395
       3.0  0.3230
       3.5  0.4170
       4.0  0.5160
       4.5  0.6141
       5.0  0.7054

And you could make another graph ...

MTB > plot c4 c3

Plot

        -
    0.75+
        -                                                   *
power1  -
        -                                              *
        -
    0.50+                                         *
        -
        -                                    *
        -
        -                               *
    0.25+                          *
        -
        -                     *
        -           *    *
        - *    *
    0.00+
          +---------+---------+---------+---------+---------+------diff
        0.0       1.0       2.0       3.0       4.0       5.0

Now, what if you wanted to show what happens to power when alpha changes? In the Options dialog box, you could enter .01 rather than the default of .05 ... using the other information from the last example ... and then click OK twice ... once for the Options and then another time for the main dialog box. The session window output would be:

MTB > Power;
SUBC>   ZOne;
SUBC>     Difference 0 .5 1 1.5 2 2.5 3 3.5 4 4.5 5;
SUBC>     Sample 25;
SUBC>     Sigma 10;
SUBC>     Alpha 0.01.

Power and Sample Size

1-Sample Z Test

Testing mean = null (versus not = null)
Alpha = 0.01  Sigma = 10  Sample Size = 25

Difference   Power
       0.0  0.0100
       0.5  0.0124
       1.0  0.0200
       1.5  0.0344
       2.0  0.0577
       2.5  0.0925
       3.0  0.1410
       3.5  0.2045
       4.0  0.2824
       4.5  0.3723
       5.0  0.4698

Clearly, if you reduce alpha as I did in this case ... which effectively pushes the critical values out further, then power gets smaller ... and of course, beta would go in the other direction.
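
To put the two alpha levels side by side, here is a Python sketch that tabulates power and beta (1 - power) for n = 25 and sigma = 10 across the same differences. It assumes the usual textbook formula for a 2 tail z test rather than anything taken from Minitab's internals, and the function name z_power is made up for the illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_power(n, difference, sigma, alpha):
    """Power of a 2 tail, 1 sample z test (textbook formula, illustrative)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)    # smaller alpha -> larger critical value
    shift = difference * sqrt(n) / sigma
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


print("diff  power(.05)  beta(.05)  power(.01)  beta(.01)")
for tenths in range(0, 51, 5):                    # differences 0, .5, 1, ... 5
    diff = tenths / 10
    p05 = z_power(25, diff, 10, alpha=0.05)
    p01 = z_power(25, diff, 10, alpha=0.01)
    print(f"{diff:4.1f}  {p05:10.4f}  {1 - p05:9.4f}  {p01:10.4f}  {1 - p01:9.4f}")
```

At every difference, the alpha = .01 column shows less power and more beta than the alpha = .05 column, and when the difference is 0, power is simply alpha.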

Finally, what if the problem was to estimate or approximate what sample sizes you would need in this 1 sample z test situation to achieve some approximate power values (.1 to .9 in increments of .1)? Here is what the dialog box would look like filled in:

Keep in mind that the beta values are the complements of these power values (beta = 1 - power) ... and with alpha being .01 (I forgot to change it back to the default of .05), sigma = 10, and a difference of 5 ... you see the estimated ns needed to achieve those power values. The output from the session window would be as follows.

MTB > Power;
SUBC>   ZOne;
SUBC>     Power .1 .2 .3 .4 .5 .6 .7 .8 .9;
SUBC>     Difference 5;
SUBC>     Sigma 10;
SUBC>     Alpha 0.01.

Power and Sample Size

1-Sample Z Test

Testing mean = null (versus not = null)
Calculating power for mean = null + 5
Alpha = 0.01  Sigma = 10

Sample  Target  Actual
  Size   Power   Power
     7  0.1000  0.1052
    13  0.2000  0.2198
    17  0.3000  0.3035
    22  0.4000  0.4088
    27  0.5000  0.5089
    33  0.6000  0.6166
    39  0.7000  0.7077
    47  0.8000  0.8029
    60  0.9000  0.9027

Clearly, if you don't want much power, then you don't need much of a sample size!
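
One way to think about what is going on here: the needed n is the smallest sample size whose actual power reaches the target, which is why the Actual Power column is always a bit above the Target Power column. The Python sketch below does that search by brute force. It is only my guess at the logic, not Minitab's actual routine, and the names z_power and sample_size_for_power are made up for the illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_power(n, difference, sigma, alpha):
    """Power of a 2 tail, 1 sample z test (textbook formula, illustrative)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = difference * sqrt(n) / sigma
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


def sample_size_for_power(target, difference, sigma, alpha):
    """Smallest whole n whose power is at least the target (simple search)."""
    n = 1
    while z_power(n, difference, sigma, alpha) < target:
        n += 1
    return n


# Same conditions as the session output: difference = 5, sigma = 10, alpha = .01.
for tenths in range(1, 10):                       # targets .1, .2, ... .9
    target = tenths / 10
    n = sample_size_for_power(target, difference=5, sigma=10, alpha=0.01)
    print(f"{n:4d}  {target:.4f}  {z_power(n, 5, 10, 0.01):.4f}")
```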

The nice thing about all of these cases is that in Minitab, you can alter the conditions in the dialog box and then see what happens under a different set of conditions ... and by using the Options box to store results, you can create columns of values to compare later.

I hope you get the hang of how to let Minitab help you with these kinds of problems.