Case 17

Games Educators Play

EDITOR’S NOTE: This case was first published in Business Voyages:  Mental Maps, Scripts, Schemata, and Tools for Discovering and Co-Constructing Your Own Business Worlds in 2008.  I (Richard) wrote the case during the fall of 2003.

I posted this case here July 11, 2016 primarily to enable readers to gain some understanding of a metric I developed for measuring the productivity of teachers to increase the fairness of student evaluations used to measure the performance of teachers, a CITP, or Composite Indicator of Teaching Productivity, first published in our (Stapleton & Murkison) article, "Optimizing the Fairness of Student Evaluations:  A Study of Correlations Between Instructor Excellence, Study Production, Learning Production, and Expected Grades," published in 2001 in the Journal of Management Education by the Organizational Behavior Teaching Society.  As of 2016 the article has been cited in 61 refereed journal articles in several disciplines, from physics to psychology.  This case presents the context prompting the development of Optimizing the Fairness of Student Evaluations and some of its major findings and implications.  I would have posted the article itself if I could.  Unfortunately it contains tables, charts, and graphs that would be impossible to format on this website.

This case focuses primarily on my use of transactional analysis concepts and techniques in my teaching career as a professor of business policy, entrepreneurship, organizational behavior, ethics, operations management and management information systems.  

Transactional analysis appealed to me in 1970 because it was readable, learnable, and teachable. Eric Berne, the founder of TA, was clear and concise in his books defining terms and concepts such as ego states, transactions, Games, scripts, time structuring, group cultures, group imagos, group etiquette, organizational boundaries, agitations, and cohesions (Berne, 1957, 1963, 1964, 1966, 1970). I was first exposed to his teaching in What Do You Say After You Say Hello, which I received in 1970 through the Book of the Month Club when I was teaching general management, my first year in full-time teaching, in a business school at the University of Southwestern Louisiana. I had just earned a doctor’s degree in business administration with a major in management science with minors in economics and finance in 1969 from Texas Tech University.  Eric Berne's most popular book, Games People Play (1964), has sold over two million copies and is still selling today.

Despite the fact I had never had a business course as a student that entailed nothing but the analysis and discussion of cases for grades, I was teaching in a department that required all teachers to use the case method (Christensen & Hanson, 1986; Christensen, Garvin & Sweet, 1991; Christensen, 1992; Gragg, 1940; Towl, 1969), that required the reading and discussion of a case a day, with no standardized tests, the final grade being based solely on class participation and one case write-up used as the final exam.

I was told by Bernard Bienvenu (1969), a full professor and head of the department with a Harvard business doctorate, a native of the area, to just go in the classroom and talk with the students about the cases. Knowing what to say to these students in this situation after you said hello was not a simple problem. In a case of the blind leading the blind, not only did I have little understanding of how the case method worked, I had doubts about how well I could explain all those management science, economics, finance, accounting, marketing, and organizational behavior theories, procedures, and concepts I had read and been tested on in my doctoral studies.

I had good recommendations from my professors at Texas Tech and one of them, Carlton Whitehead, Coordinator of Graduate Studies in business, from Louisiana, had taught under Bernard for a year or so using the case method.  Carlton thought I would like the case method and South Louisiana culture, replete with good fishing and good seafood, especially crawfish étouffée.  Bernard told me he did not have enough money to even think about hiring Harvard doctorates to teach in his department, but he thought I would be able to learn the case method on the job, given my background and proclivities.  He offered me a decent salary and the associate professor rank, enabling me to skip the assistant professor rank, a feat almost unheard-of in those days.

I left the University of Southwestern Louisiana (now University of Louisiana-Lafayette) after my first year, largely because of my feelings of inadequacy using the case method. I moved on to Georgia Southern College (now University), where I have remained 33 years and where no one but me used the case method.  In a surprising and frustrating twist of fate, after I had already accepted my new position at Georgia Southern and had submitted my letter of resignation to USL, after it was too late to back out, during my second semester at USL it dawned on me how the case method worked. I became comfortable using the case method, I enjoyed talking with students using the case method, I felt successful using the case method, and I have used almost nothing but the case method in my business teaching since. This was one of the most significant if painful learning experiences of my life, probably the most significant academic learning experience.  To this day I am convinced business students learn more and better about business in general using the case method than they do memorizing theories and concepts for standardized tests.

I became a full professor at Georgia Southern at age 36, having published a book (Managing Creatively:  Action Learning in Action, University Press of America, 1976) containing cases and case method procedures, and a little transactional analysis, but I was a lone wolf or a voice in a case method wilderness throughout my career, missing good years of dialogues with my case method mentors at USL, the best business teachers I have encountered. Probably less than 5 percent of all business teachers worldwide use the case method, and fewer still use transactional analysis. Shortly before I left USL, Bernard Bienvenu signed me a copy of his book New Priorities in Training (1969), published by the American Management Association, and included this inscription, “Good wishes for the gift of perseverance which leads to accomplishment.”

Accidental Causes of Learning

Fate, blind luck, evolution, or historical accidents cause us to be exposed to the most significant teaching and learning in our lives—especially the teaching we received from our parents in our earliest years. Not only do we learn how to walk, talk, and squawk in this school, we learn about how to feel, think, and do for good or ill throughout our lives by what we decide about the winner, loser, or non-winner scripts that are passed down to us from our ancestors.

What else we are taught and learn is also largely a matter of luck or fate, in my opinion, depending on what is taught in the media, religious institutions, communities, schools and universities, and culture on the particular terra firma spot on which we are accidentally born and reared. Although few would argue reading, writing, and arithmetic should not be taught to all children, even teaching such as this is not yet universal, not taught at all among isolated indigenous tribes, and, while taught, is not learned by many students even in advanced societies. Even if these basic skills are taught to all children in compulsory schools, if a student is accidentally exposed to a poor-enough teacher or school she or he may not learn one or more of these skills because of the inadequate teaching, and, if a student has already been accidentally taught by parents in a powerful enough way not to think, learn, or be successful, such a student may not learn how to read, write, or do arithmetic even if exposed to the best of teachers and schools. Adding more unfairness and uncertainty to this cosmic learning lottery, some students apparently inherit genes through biological processes that predispose them to be good learners of various skills. Otherwise, how could people such as Benjamin Franklin, Abraham Lincoln, Frederick Douglass, and Thomas Edison have learned and accomplished what they did with less than 6th grade educations?

Had I not encountered Vince Luchsinger, a professor of management at Texas Tech who told me in 1966 I could work on a doctorate at Texas Tech, I would never have become a university professor and for sure I would not have learned what I have learned since then. Had I not encountered Carlton Whitehead, the coordinator of graduate studies in business at Texas Tech who told me about the USL case method position, I would never have learned how to use the case method. Had I not accidentally received Eric Berne’s What Do You Say After You Say Hello through the Book of the Month Club, I might never have learned anything about TA, and I would not have written this case. Because of reading this book and because of previous books I had read and experiences I had involving my own script, I was able to understand how TA concepts could be used to deal with problems I was experiencing as a teacher. One’s lifetime education, therefore, largely depends on the people and reading matter one accidentally stumbles across, assuming one learns to read in the first place.

Learning To Teach TA

It seems to me teachers in TA certification programs for educators have generally been taught some TA beyond what they might have learned on their own reading transactional analysis books and articles and were left to their own devices to figure out how to use TA in whatever kind of teaching they might do. One might think it might help for teachers certified in education by TA associations to have a degree or two from a college of education, since some good research and literature have been published in this field. I have read some of this literature (Dewey, 1935; Dressel, 1961; Freire, 1970; Gagne, 1977; Giroux, 1992), but most of us in higher education, outside schools and colleges of education, have had little or no teaching or training about how to teach by experts in the field of education. We may be relative experts in our fields, such as business, finance, physics, or whatever, but anything we know about teaching we have generally picked up by copying the teachers who taught us. I am convinced most teachers are scripted by the teachers who taught them in their undergraduate and graduate courses in their fields. Teachers generally copy or introject the values, attitudes, group imagoes, ego states, transactions, and teaching methods exhibited and used in the classroom by their favorite or most influential teachers, which they then pass on to their own students. I am the only teacher I have heard of who adopted a teaching method used throughout his teaching career that was not used by his undergraduate and graduate teachers. Based on my observations teachers almost never change their teaching scripts throughout a teaching career.

I encountered some of the brightest and best teachers and fellow students I have experienced as a student learning transactional analysis at the Southeast Institute at Chapel Hill, North Carolina during 1975-1980—Martin Groder (1977), Vann Joines (1987), Graham Barnes (1994), Ken Ernst (1972), John O’Hearn, Ken Sowers, Pam Dickson-White, Shep Gellert (1983), Buzz Lee, Nick Moore, Grady Hough, Tim Schnabel, Russ Osnes (1974), Pamela Navarro, Pam Jelly, Jake Jacobs (1991) and many others. All of these people were adults with degrees in several fields, including medical doctors, psychologists, a registered nurse who taught in a medical school, an elementary school teacher, two university professors of mathematics, a CPA, a minister, an ex-Army chaplain, an ex-college president, and others. We discussed complex, vexing, sometimes insidious problems, yet we had some fun, and most of us would try to give anyone a straight answer about any problem. While there was probably some Game-playing going on even here, there was far less of it than I have experienced in other schools. I was amazed and encouraged by what Ken Ernst (1972) was teaching his grade school students in California, and I hoped he might become a role model for public schools everywhere. He was teaching students in a public grade school about ego states, transactions, Games, and scripts in ways they could understand and was giving them tools for dealing with horrific psychological problems some of them had experienced in their lives at school and at home. Unfortunately he may have been one of a kind, since he is the only teacher I have known who did this kind of teaching in a public grade school.  Almost all my teaching experience has been teaching business students in colleges and universities who were 18 or more years old and adults of various ages in continuing education and consulting programs. 
While I have innovated some techniques and procedures I consider effective in my classes, I have had no firsthand experience indicating they work in classes at all grade levels.

In 1980 I thought about teaching TA to all kinds of teachers anywhere. I had 1,000 coated, folded 4-page brochures printed in two colors that I mailed to curriculum directors throughout the United States, which resulted in the sale of one contract, six hours worth of didactic teaching in one day for a county school system, in a Southern state other than Georgia, as in-service training for all teachers in the system before school started. I think I was paid $50 an hour. The curriculum director said she hoped I was good because the school system would have her head if the teaching bombed since it cost so much. While the teaching did not bomb as such, I was, in my opinion, less than successful, and I did not mail out any more brochures to school systems pell-mell. Some of the teachers in this experience told me personally they benefited from my TA teaching, but it was obvious many did not like what they heard or me personally, and I did not like what I saw in some of them. Here I was teaching supposedly-mature professional teachers about how to motivate students and create positive attitudes when some of these teachers had worse attitudes and were more immature than were some of my college students. Some of them needed therapy, and there was no way I could deal with these issues in this setting, ethically or otherwise. One attractive teacher in her forties became upset when I suggested that they might randomly select students to talk about homework in class. Accosting me before a group of 100 or more teachers, she said, “I would never dare do that with my students. It would embarrass them to death.”

I had been conducting some night TA courses through continuing education at Georgia Southern for adults in my community from any profession or walk of life that were successful in my opinion. The difference was the teaching method. In the six-hour contract with 200 teachers I had to generally fill up the airtime by lecturing, defining terms, and giving my own opinions. In my continuing education courses I had passed out TA material I discussed in groups of 30 or fewer students as students leisurely applied the material to their own situations. Unfortunately, by 1985 or so few students were signing up for these courses. Either I had taught by then almost everyone in my community who wanted to learn about TA or almost everyone around here had decided TA did not work. Or maybe everyone had heard about me as a TA teacher and did not enroll because of my persona and proclivities. Regardless, I have done very little continuing education TA teaching since 1985. I have, however, continued to use TA in my teaching of undergraduate business students in my business policy and entrepreneurship courses.

I also conducted some successful TA training programs with some businesses in this area during the late 70s and early 80s. I worked with one local manufacturing plant off and on for 5 years, teaching TA to about 300 employees at all levels in didactic sessions and working with them in group meetings discussing plant problems. I worked with them on contracts involving not only interpersonal conflicts but also such things as purchasing new computers, inventing new information systems, and changing the scheduling system of the plant. Unfortunately this plant has since been sold twice and most of its jobs have been moved to a low-wage country.

De-Gaming Teaching and Learning

When I was teaching at the University of Southwestern Louisiana, Rex Hauser, one of the two Harvard Business School doctorates teaching in the department, would randomly select students to start case method discussions by picking a card from a deck of computer data cards he shuffled in front of the class every day. These cards were turned in by students the first day of class to prove they had enrolled in the course. Whoever’s card got picked had to tell the class what he or she thought was the problem in the case and what to do about it. I had never seen or heard of such a thing. Part of my learning at USL entailed learning that this random-selection process worked. It became obvious to me that Rex’s students read the cases before they came to class and were learning more than my students were learning because of this random-selection process. While I never had the confidence to try this data card-shuffling idea at USL, after I got to Georgia Southern after a year or two I decided to have my students position their desks around the perimeter of the classroom and play spin the bottle.

The Coke bottle would roll and rattle around on the floor, creating a distraction, so I started spinning a pen on my seating chart, which was also circular, reflecting the circular layout of the classroom. The person whose name the point of the pen stopped on became the accidental discussion starter of the day. This solved the rolling-around problem of the bottle, but since I sat in a desk just like those the students sat in, which had a slanted top, my pen would often roll off the desktop before it stopped spinning.

By this time, I had learned about psychological Games and the Karpman Drama Triangle (1968) at the Southeast Institute, so I concocted a circular piece of wood 9" in diameter to which I affixed a spinner, also made of wood, which I labeled the Classroom De-Gamer, which I described in an article I sent to the Transactional Analysis Journal, which was accepted and published under the title “The Classroom De-Gamer” (Stapleton, 1979a). I drew concentric circles of numbers on the De-Gamer, corresponding to different class sizes. Each student in a class would have a permanent number for the course corresponding to where he or she had decided to sit in the circle the first day, numbers being assigned in sequence around the room starting with any student I might pick. Students sitting near me in the circle day in and day out could see that the spinner stopped on particular numbers, and they would know I was not playing Games when I called out the number for someone to start the discussion and prove they had fulfilled their contract for the day by reading the case before class. Since I sat in different desks on different days different students would watch the spinner spin.
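The selection procedure described above can be sketched in a few lines of Python. This is only an illustration, not the wooden device itself: the function names, the roster names, and the use of a uniform random choice to stand in for a fair spin are all assumptions made for the example.

```python
import random


def assign_seat_numbers(students_in_circle_order):
    """Give each student a permanent number, in sequence around the circle.

    The list is assumed to follow the seating order chosen on the first day,
    starting with whichever student the teacher picks.
    """
    return {number: name
            for number, name in enumerate(students_in_circle_order, start=1)}


def spin(seat_numbers, rng=random):
    """Simulate one spin; every seat number is assumed equally likely."""
    number = rng.choice(sorted(seat_numbers))
    return number, seat_numbers[number]


if __name__ == "__main__":
    roster = ["Ann", "Ben", "Cal", "Dee", "Eli"]  # hypothetical names
    seats = assign_seat_numbers(roster)
    number, starter = spin(seats)
    print(f"Number {number}: {starter} starts today's discussion.")
```

Because the numbers are fixed for the whole course and the draw does not depend on the teacher, no one in the room, including the teacher, chooses who starts.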

I created my own publishing company in 1979, Effective Learning Publications, and printed 500 copies of a book I wrote titled De-Gaming Teaching and Learning: How to Motivate Learners and Invite Okness (Stapleton, 1979b). The book was built around the Classroom De-Gamer, describing how it tended to De-Game Games such as NIGYSOB, KICK ME, GEE, YOU’RE WONDERFUL PROFESSOR, POOR ME, DO ME SOMETHING, COPS AND ROBBERS, I’M ONLY TRYING TO HELP YOU and ISN’T EDUCATION WONDERFUL. I pointed out no student could logically feel or think I was acting out the Persecutor or Rescuer role when she or he was selected by the De-Gamer to start discussions, and no student could logically think he or she was a picked-on Victim when selected. I am sure much of my motivation for writing this book stemmed from the disgust I felt as a student in grade school and high school in classes taught by teachers who had teacher’s pets.

I am indebted to Martin Groder, MD at Chapel Hill, North Carolina for his help writing this book, not only for his powerful TA ideas published and cited in the book, but also for his insights and feedback regarding the overall problem of de-gaming Games.  Marty gave me a great stroke for the book, which I used as a blurb marketing the book, with his permission, which I am convinced significantly increased its sales, primarily through Trans-Pubs, the ITAA bookstore, that marketed TA books internationally through direct mail and in bookstores.  Trans-Pubs set up traveling bookstores at ITAA conferences in various locations around Earth.

Marty said, "De-Gaming Teaching and Learning is a major new application of transactional analysis."

This short blurb was a high compliment, coming from Martin Groder, printed with the book listing in all Trans-Pubs catalogs and mail-outs.  Marty was a major leader of TA at the time.  He was a protege of Eric Berne, having worked with Berne in San Francisco as a trainee, and was a leading contender for president of the International Transactional Analysis Association when De-Gaming Teaching and Learning was published in 1979.  The ITAA at that time was a vibrant and growing inter-disciplinary professional association with about ten thousand international members. 

In De-Gaming Teaching and Learning I described in 132 pages how I thought ego states (Berne, 1970), transactions (Berne, 1970), ego-grams (Dusay, 1972), scripts (Goulding & Goulding, 1976; Steiner, 1974), time structuring (Berne, 1970), strokes (McKenna, 1974; Steiner, 1971), Games (Berne, 1964; Karpman, 1968), rackets (English, 1971, 1972), mini-scripts (Kahler, 1974), and OKness (Ernst, F., 1971; Groder, 1977) related to classroom situations. I made recommendations in the book, such as teachers using appropriate ego states in the classroom (generally more Adult, Nurturing Parent, and Free Child, but yet some positive Critical Parent and Adapted Compliant Child to maintain discipline and productivity); encouraging, for college students at least, more Adult—Adult transactions than others in the classroom; giving permission to students to overcome detrimental drivers and injunctions; allowing students where appropriate to express real feelings; allowing and encouraging students to think, learn, and contribute in class at the highest level of which they are capable; and managing the class in such a way as to maximize OKness among all participants. I used TA to explain how I thought classrooms should be designed and built, arguing that circle classroom layouts with the teacher sitting in a desk or chair just like everyone else tended to create a more I’m Ok—You’re Ok, Adult—Adult culture, as did amphitheater classrooms; whereas classrooms with students sitting in desks in fixed rows and columns with the teacher standing behind a lectern or sitting behind a big desk tended to create a more Parent—Child, I’m Ok—You’re Not Ok culture. I also recommended that teachers develop clear contracts with students at the outset of a course through course syllabi, a major feature of the contract being that students would read and do homework before class and share and discuss ideas about it in class.

I printed up some more coated two-color brochures offering for sale De-Gaming Teaching and Learning and mailed them out nationally and internationally to names I had purchased from a mailing list broker. Each mail-out would sell a few books, generating about a one-half of one percent response rate, which would not cover the cost of the printing and mailing of the brochures. I sold 300 or so copies through Trans-Pubs, and sold all the 500 copies in the first printing over the next few years. The book is still listed in Books in Print and is still available through Effective Learning Publications, although no orders have been received in 6 years. 

(Postscript:  I updated De-Gaming Teaching and Learning in 2016, retitling it Born to Learn: A Transactional Analysis of Human Learning, publishing it through Effective Learning Publications.  It's not a best seller yet but it has received some good reviews and book royalties are trickling in.  You can find the reviews by typing Born to Learn:  A Transactional Analysis of Human Learning into Google.  You will find several results.)

In the late ’70s and early ’80s transactional analysis concepts were sometimes included in standard textbooks published by major publishers in business disciplines such as organizational behavior and business policy, which I taught and still teach. TA concepts are no longer there in new editions of those books. I have not seen any mention of TA in any business book, except mine, in several years, and almost no one ever mentions TA to me anymore, including people in the community whom I taught TA. One exception is Jack Mallard, a successful retired businessman, who has asked me several times in recent years around town, “Do you remember that TA class we had back in the ’70s?”

I have written a 79-page monograph I call Business Voyages (Stapleton, 1998), printed by Georgia Southern Printing Services and sold to students by the university bookstore, paying no royalties, that I use in my Applied Small Business course, a course that entails students consulting with and writing cases about small businesses in the community, involving about 30 students per year. This monograph has a chapter explaining how I think TA concepts and procedures relate to entrepreneurs and small business. In written feedback about the course students often say the TA material was among the most interesting parts of the course. It seems none of my business students in recent years have heard of TA before they read Business Voyages.

The metaphor of Business Voyages is that business people sail in a boat with others, whether in companies or countries, for the short time of their lives, in rivers, lakes, and oceans. Inertia created by scripts and currents and winds in the environment inexorably sweep everyone somewhere, but it is possible for individuals to chart courses and reach ports of call on Earth they might personally prefer.

As I pointed out in my original Classroom De-Gamer article (1979), I am still convinced the Classroom De-Gamer makes my business classes more Adult, fun and productive than they otherwise would have been and that the De-Gamer reduces Game-playing in my classrooms.

Although I have never had problems attracting my share of good students for my undergraduate or graduate classes at Georgia Southern, some students have always shied away from my courses because of the Classroom De-Gamer, which is generally referred to by students as “The Spinner”, although it has also been called “The Death Wheel”, among other things. Several years ago I became convinced that teachers might be more inclined to use De-Gamers if the device were called something else. I had a new batch of 15 made by a local woodworker and had the name “Wheel of Fate” printed on them as a trademark. While some students seem to like this name better, apparently most still call it “The Spinner”. Teachers still do not like the idea. I sold 5 of the new batch and 10 are stored in my attic. They are made to last from sturdy oak wood.


Teaching generally entails teachers tasked to teach students the content and skills of subjects specified by a school curriculum. These subjects range from reading in the first grade to calculus in high school with all subjects in between—geography, English, plane geometry, history, chemistry, physics, psychology, vocational agriculture, whatever. In most cases, textbooks are specified by a state agency, but in some cases they are not, especially in colleges and universities with academic freedom in which professors may select or write books they require students to read, as have I (Stapleton, 1976, 1985, 1998, 2003). A given for a teacher is to cause students to learn the content or skills mandated by the curriculum regardless of the books or methods he or she might adopt (Gagne, 1970).

Most of my students have performed teaching functions since I have delegated to them the freedom and responsibility to write cases they publish to their classes. They educate one another when their cases are discussed in class, which usually consumes about 25 percent of the airtime of my courses. Most of the books I have adopted for my courses have been written by business professors at Stanford (Collins & Lazier, 1995) and Harvard (Stevenson, Roberts, & Grousbeck, 1999). An exception is a business policy and strategy book I have used for about 30 years written by professors at the University of Alabama (Thompson & Strickland, 2003).

According to accounts I have read and seen in various media and conversations I have had with people familiar with what goes on in public schools, it appears some public school teachers are more or less forced to be more concerned with discipline problems and keeping order than with causing learning to occur. I sympathize with these teachers and respect and admire the abilities and efforts required to deal with problem students in these situations. While I think knowing TA concepts and techniques can help them deal with these situations, I do not think TA is a panacea. I have been fortunate since most students where I am do not create discipline problems. I have experienced no violent behavior in my classes, although in recent years I have had increasing problems with anti-social behavior such as students coming to class late and gossiping in class when responsible students are discussing the assigned reading. Because of this phenomenon, 5 years ago I wrote laws in my syllabi that students must agree to as a contract at the beginning of the course that this sort of behavior is outlawed, specifying they will lose letter grades if they violate the law, which has happened in several cases. Unfortunately I suspect college and university teachers will have more problems with discipline and anti-social behavior in future years than I have experienced in my career. This past summer I taught a beginning business course in which there were 4 or 5 students in the class of 40 who obviously had experience with gangs in high schools. Up until the past year or so I was unaware that gangs existed in this area. While I recognize that violence (Gilligan, 1997) and anti-social behavior are serious problems in many schools, my emphasis in this article is on how to produce relevant learning and De-Game students, teachers, and administrators.

Designing the Course

It seems to me most teachers including grade and high school teachers have some freedom to decide how to cause learning to occur, and there are two basic strategies for making this happen, in my opinion: (A) the teacher can do most of the talking in class and basically tell students what to remember for tests, or (B) the teacher can require students to do most of the talking in class by discussing and applying what is in the books, cases, and other reading.

Most teachers adopt strategy A. The teacher fills up the airtime talking in class, thereby seeming to be earning his or her pay. Students do not have to do homework day in and day out and are relatively easy to control since all they are generally supposed to do in class is stay awake and pay attention to the teacher and remember what the teacher tells them for tests.

Strategy B requires students to read homework and participate in class. Students have to think about what to say in class. They exercise their personalities and sometimes argue with one another in class. The teacher may say very little. Students are harder to control and may waste the time of the class saying things that will not show up on tests. Some observers may think the teacher is too lazy to teach.

As you might expect, I think strategy B causes more and better learning to occur. For one thing it gives students practice reading empirical and intellectual material, but more than that it requires students to think for themselves about what is relevant (Bateson, 1972; Fuller, 1969) and communicate dialogically and dialectically in class with fellow students about significant issues (Abercrombie, 1960; Buber, 1966, 1969; Habermas, 1981). With a little supplementing and explaining on the part of a qualified teacher most students can understand and remember much of the content specified by the curriculum—plus they will develop valuable work habits and communication skills.

A major social problem with the latter approach is that it becomes obvious as the course unfolds who are the better students, and if you grade based on class participation, rather than true-false, multiple-choice, or other so-called objective test questions you might have spoon-fed answers to before tests, it becomes obvious who are relatively the excellent, good, average, and poor learners and contributors. If one is to be fair grading, most of the grades will be Cs, since on a 4-point system C=average, and there will be relatively few As, since A=excellent. Since grades are based largely on class participation, and since the students in the class also had their eyes and ears open, they too know who were the excellent, good, average, and poor contributors, and since they are believers in the equity theory of management just like everyone else, that is, that a worker, student, or contributor to anything should be rewarded the same as others who contributed the same, the teacher must give each student what he or she actually deserves, since students quite naturally will compare their final grades after the course is over.

I have been criticized because I have not used so-called objective tests with true-false, multiple-choice, fill-in-the-blank, or short-answer questions to prove students have learned. I strongly disagree with this criticism. In the first place, does memorizing answers for such tests prove relevant learning occurred, especially when one considers that answers are often spoon-fed, that test questions can be calibrated downward in difficulty to make sure all students can make A’s and B’s, and, if this does not work, that raw scores can be curved upward to make sure all students receive A’s and B’s?

Almost all my grades have been based 80 percent on class participation and 20 percent on case write-ups and long essays. I have not required students to memorize specific points or derive exact answers for mathematical problems. After students have read the text material and cases (most of which contain numbers and require mathematical reasoning) before class and have discussed it in class, especially while being subject to the draconian penalty of losing a letter grade if caught unprepared by the Classroom De-Gamer (Stapleton & Stapleton, 1998), a good bit of the material will stick with them, probably more than would have been the case had I attempted to tell them everything they needed to know for tests (Stapleton & Stapleton, 1998). Most of my students agree with me on this. Many say they forget everything they memorized for tests two weeks after the course is over.

A primary gain from using multiple-choice, true-false and other so-called objective tests as sole determinants of final grades is creating an illusion of objectivity, achievement, and fairness for students, teachers, administrators and parents. A secondary gain is absolving students, teachers, administrators and parents from the responsibility of learning in daily class discussions how much some students know about the subject and how little others know.

I ultimately agreed with my colleagues in Louisiana who had been trained in the Harvard Business School, probably the most successful business school in the world (Forbes, 2003), that the purpose of university business teaching is not to teach students to learn how to understand and remember theories, opinions, models, or procedures, or solve algorithmic puzzles but to learn how to understand business situations as they exist, make good decisions, create workable policies and strategies for dealing with real situations, and convince others that one’s ideas should be implemented. I also agree with other colleagues that most schools are not business schools, and their purposes are different. The purpose of most schools is to teach students how to understand and remember geographic and historical facts, theories, concepts, mathematical procedures, and the like. While my colleagues in colleges of education know much more than I about the accepted theory and techniques of testing and proving whether learning has occurred (Dressel, 1961), it is obvious that passing tests does not prove whether one will be able to understand business situations as they are encountered in real life, when the stakes are high, otherwise some of my colleagues with PhDs in business would not have lost half their retirement savings in 2000 in the stock market.


Grade inflation—teachers giving students grades higher than they deserve—which entails teachers playing Games such as GEE, YOU’RE WONDERFUL STUDENTS, I’M ONLY TRYING TO HELP YOU, and ISN’T EDUCATION WONDERFUL, is rampant, not only in colleges and universities, but also in public schools (Healy, 2001; Johnson, 2002; The Economist, 2001, 2002). According to The Economist (2002), notable exceptions to the problem of grade inflation internationally are graduate schools such as the Harvard Business School “who have manfully maintained a rigorous grading curve.”

I have been engaged in research dealing with this problem for 7 years. I invented a metric in 1997 I call the Composite Indicator of Teaching Productivity, or CITP, which entails estimating the success of a teacher based on an average of ranks for four variables—Instructor Excellence, Study Production, Learning Production, and Relative Expected Grades—taken from student responses on student evaluations of teachers. Unfortunately, while the CITP has been published in the Journal of Management Education (Stapleton & Murkison, 2001), and I have emailed the article to all administrators and teachers in my college (Stapleton, 1978), some 90 people, as well as the provost of the university, I have seen little evidence the CITP has been significantly used in our college. The CITP forces administrators to take into account how much teachers motivate students to study and the grades teachers have led students to expect, not just how well teachers pleased students as evidenced by instructor excellence scores.
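The averaging-of-ranks idea behind the CITP can be sketched in a few lines of Python. This is only an illustration, not the procedure published in Stapleton & Murkison (2001): the three teachers and their scores are hypothetical, the four ranks are weighted equally, ties are ignored, and the direction of the Relative Expected Grades rank (lower expected grades earning the better rank) is an assumption made for this sketch.

```python
# Assumption: higher scores earn better ranks on the first three variables,
# and LOWER relative expected grades earn the better rank on the fourth.
FOUR_VARIABLES = {
    "instructor_excellence": True,
    "study_production": True,
    "learning_production": True,
    "relative_expected_grades": False,
}

# Hypothetical mean scores (1-5 scale) from student evaluation forms.
scores = {
    "Teacher A": {"instructor_excellence": 4.6, "study_production": 2.1,
                  "learning_production": 3.0, "relative_expected_grades": 4.5},
    "Teacher B": {"instructor_excellence": 4.0, "study_production": 3.2,
                  "learning_production": 3.6, "relative_expected_grades": 3.4},
    "Teacher C": {"instructor_excellence": 3.5, "study_production": 4.4,
                  "learning_production": 4.1, "relative_expected_grades": 2.3},
}


def rank_teachers(scores, variable, higher_is_better):
    """Return {teacher: rank} for one variable, where rank 1 is best."""
    ordered = sorted(scores, key=lambda t: scores[t][variable],
                     reverse=higher_is_better)
    return {teacher: position
            for position, teacher in enumerate(ordered, start=1)}


def composite_indicator(scores):
    """Average each teacher's four ranks; a lower average is better."""
    ranks = {variable: rank_teachers(scores, variable, better)
             for variable, better in FOUR_VARIABLES.items()}
    return {teacher: sum(ranks[v][teacher] for v in ranks) / len(ranks)
            for teacher in scores}


citp = composite_indicator(scores)
for teacher in sorted(citp, key=citp.get):
    print(teacher, citp[teacher])
```

With these made-up numbers, Teacher C ranks last on instructor excellence alone but first on the composite, which is the kind of reversal the CITP is meant to expose.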

After emailing the CITP article to all teachers and administrators in our college, I received two responses. One administrator told me in confidence he agreed with me about the pernicious effects of expected grades and that he thought it would be easy to program a computer to compute CITPs every semester for every faculty member. The other response came from a faculty member who told me he basically agreed with the CITP concept but most faculty were threatened by it and were upset that I had sent them the article in an email attachment. He said he told them that no one had forced them to open the attachment and read the article, but he also said he thought it was more or less traditional in academia that faculty members should not try to force other faculty to read and use research.

Despite my success five years ago causing study production, learning production, and expected grades questions to be added to the student evaluation form used campus-wide at our university (Stapleton & Murkison, 2001), administrators generally persist in paying attention only to instructor excellence, which shows how well the teacher pleased students. They do so despite the fact that my research (Stapleton & Murkison, 2001; Stapleton & Stapleton, 2003a, 2003b) and the research of others (Chacko, 1983; Greenwald & Gillmore, 1997a, 1997b) have clearly shown that instructor excellence scores are significantly affected by relative expected grades, that is, the grades students expect in a course relative to the grades they normally expect in a course for a given amount of work: the higher the relative expected grades the higher the instructor excellence score, and vice versa. These are cases, in TA Game words, of students playing GEE YOU’RE WONDERFUL PROFESSOR when their relative expected grades are high and NIGYSOB (Now I’ve Got You, You S.O.B.) when their relative expected grades are low.

In my opinion administrators persist in counting only instructor excellence and discounting study production and expected grades, despite research indicating how Game-infested this is, because this increases their own evaluations and raises as administrators. Counting only instructor excellence makes most teachers look better than they otherwise would, and since teachers also evaluate administrators, administrators must please most of the teachers in their units to receive high administrator evaluation scores.

Such behavior contributes to maintaining the moral equilibrium (Baier, 1958) of the system, morality being doing what is thought to be right based on abstract principles such as honesty and truth and agency theory (Goldman, 1970) such as doing what will advance your own interests and/or the interests of others. Although it seems to me abstract principles such as honesty and truth should supersede acting in a self-interested manner in schools, it seems this is often not the case. Unfortunately there are no criteria for grading students, teachers, and administrators in a school that will advance the interests of all others. It may be that in most school systems challenging teachers who honestly grade based on relative academic achievement are sacrificed, since the utilitarian criterion of justice (Rawls, 1999), i.e. attempting to create the greatest happiness for the greatest number, may be optimal (Stapleton & Murkison, 2001). Students, after all, really do need high grades to get scholarships and job offers, and teachers and administrators need high merit raises in order to be able to save money to send their own children to college, and so forth.

Teachers playing ISN’T EDUCATION WONDERFUL by lowering their standards to create high expected grades generally win—because it only takes a small percentage of students in a class taking revenge with NIGYSOB to destroy the departmental ranking of a teacher with high study production and low relative expected grades scores if only instructor excellence is counted in the process of evaluating teachers.

Administrators in my department up to the present time (March 2004) have ranked faculty based on instructor excellence scores and have discounted (Schiff & Schiff, 1971) study production and relative expected grades scores. I usually rank in the bottom 25 percent of these instructor excellence ranks. After my study production and learning production questions and the administration’s absolute expected grade question were added to the campus-wide form three years ago, administrators in our college stopped furnishing teachers scores for all questions on the student evaluation form. By making a special request I have been able to secure this data, and I know my study production score has been high and my expected grades have been low (Stapleton & Stapleton, 2003a, 2003b).

Despite the Game-playing student evaluations create, student evaluations in colleges and universities do more good than harm in my opinion, primarily because they show what all students in a class thought about the teaching of a teacher. In the absence of student evaluation data teachers are at the mercy of administrative opinion picked up in hallways and offices based on small samples of student opinion, which are more contaminated and Game-infested than student evaluation data, since most administrators have been taught they do not have time to actually see what is going on in the classrooms of teachers.

I have no firsthand experience with teacher evaluation systems in public schools, but as I understand the situation, since students in public schools are presumed to be too young and immature to evaluate teachers, student evaluation forms are not used; administrators sometimes visit classes to observe teachers teaching; there is more peer review; and student test scores on standardized tests are sometimes used to judge how well a teacher taught. In my college we have had almost no observing and peer review, and student test scores are almost never used to prove anything. About the only thing that counts in teaching evaluations is the instructor excellence score produced by student evaluations.

In at least one sense the problem of teacher evaluation is less significant in public schools than in colleges and universities, since most public schools do not have merit raises, salary raises being based instead on degrees earned and seniority. In colleges such as our business college there are merit raises, and differences in merit raises among teachers year in and year out can cumulate to significant yearly salary differentials over time. Therefore Game-playing around teacher evaluations is probably generally heavier in colleges and universities than in public schools. If the merit raise is based on receiving relatively high scores for instructor excellence, one can see why grade inflation is rampant in colleges and universities that have merit raises, and since using a Classroom De-Gamer increases one’s study production score which reduces one’s instructor excellence score, it is easy to see why most college and university teachers would never dream of using a Classroom De-Gamer.

Managing Colleges and Universities

The grade inflation issue has significant implications for the management of colleges and universities. If professors who assign the highest grades also receive the highest student evaluations and the highest yearly merit salary increases then the amount and quality of learning among students will almost surely decline and deteriorate.

Although in overall perspective I think the University System of Georgia has adequately administered Georgia’s colleges and universities, I have been critical of the fact that most administrators in the business school at Georgia Southern did little teaching and accomplished little or no research and yet generally received salaries some 10 percent or more higher than those of professors. They did useful work (overseeing records and paperwork, conducting and attending meetings, talking with students and alumni, using student evaluations to evaluate faculty for salary raises), but this is not something you need a PhD to do. Although it may be that most state university systems automatically pay administrators more than teachers, it seems to me this policy is flawed and invalid. It seems to me administrative work is not inherently more difficult or more valuable than teaching and research, and probably the contrary is true. To verify that administrators in the business school have in general received salaries some 10 percent or more higher than those of professors, all one has to do is read the data published yearly in the State of Georgia Auditor’s Reports in the Georgia Southern library or in other libraries around the state.

According to Mieczkowski (1991, 1995), college and university administrators are often not subjected to meaningful evaluations because few people are in a position to know what they do or have any idea how well they do what they are supposed to do. Colleges and universities have the same problem large corporations have enforcing effectiveness, efficiency, and productivity at the top. Most board members are unable to understand the details, subtleties, and intricacies of operational matters yet they enjoy the prestige of being on the board. Consequently they tend to rubber stamp whatever they discuss. Many executives and administrators tell their board members they need more compensation because their counterparts in similar organizations are being given that much compensation and they might leave if they do not receive as much. The process is somewhat analogous to children extracting more allowance money from parents, by telling their parents that Johnny or Sally gets this much allowance and they should too, since they have been just as good as Johnny or Sally.

In general college and university professors use the same strategy, but at least in their cases they can point to their personal research and publication results, the numbers of courses and students they have taught, the numbers and types of courses or programs they have created, and evidence from former students that what they taught was valuable to demonstrate productivity relative to peers. It is often difficult for an administrator to point to something specific that s/he personally accomplished that demonstrates productivity relative to peer administrators. Most college administrators make little effort to accurately learn and measure how relatively well professors teach truth and produce learning in classrooms; instead they focus primarily on pleasing their superiors and rely on students and student evaluations to grade the teaching of professors.

As a participant in the unfair race in which all humans compete, I have no right to complain compared with most people in most walks of life, organizations, and countries about my career income at Georgia Southern, especially considering the fact I did not decide to become a college professor in the first place to get rich. In overall perspective I have been relatively well compensated compared to people generally and I have enjoyed more freedom, job security, and opportunity for self-actualization and travel in my work than most people. Yet it surprised me to see printed in a State Auditor’s report in the Georgia Southern library a year or so before I retired that quite a few administrators and professors in the business school at Georgia Southern had higher salaries than I, who in my opinion had contributed less of significance to Georgia Southern and to their academic fields. I was the top scorer and playmaker on every junior high, high school, and college basketball team I ever played on, and I do not like to lose, especially because of poor rules, referees, and scorekeepers.

In our "Optimizing the Fairness of Student Evaluations" study, when Instructor Excellence scores alone were ranked, I ranked 23rd of 25 faculty members in the study; based on a summary CITP, Composite Indicator of Teaching Productivity, ranking, I ranked 3rd of 25 faculty members in the study.

Like most people I believe in the equity theory of management, that is, that people doing the same thing should be rewarded and compensated commensurately with what they produce. Unfortunately in reality I wonder how widely the equity theory of management is applied. I suspect that if most organizations published salaries in auditor’s reports for all employees, many employees would wonder why many salaries are as they are. In many cases employees, teachers, professors, administrators, managers, CEOs, etc. are rewarded not for producing meritorious contributions of value to students, customers, stockholders, and humanity but for conforming to the organizational script and for pleasing the personalities of supposed superiors (Stapleton, 1978). The preacher in Ecclesiastes in the Bible was right about races not necessarily being won by the swift.

To cure the problem of salary inequities between administrators and professors all a system has to do is create a policy that does away with career administrators in academic units and provides for the random selection or selection in rotation of temporary administrators from the ranks of tenured faculty with terminal degrees for no more than three-year terms, giving them reduced teaching loads for three years but evaluating them using the same criteria that are used for full-time professors. There is no need for a permanent class of career administrators in the academic units of colleges and universities. Anyone responsible enough to earn a PhD is responsible enough to serve as a temporary administrator in an academic college or department for three years. For a professor, being drafted to serve as a temporary administrator would be like a US citizen being drafted to serve a three-year term in state or federal government. Randomly selecting temporary administrators will help De-Game (Stapleton, 1979b) the administrative process by reducing pandering, flattering, favoritism, and cronyism.

We had some excellent administrators at Georgia Southern from the president’s office on down who I thought conscientiously did their best to advance the interests of the university and their academic units, and we had no administrators at Georgia Southern that I knew whom I considered truly incompetent or corrupt, and I thought most of them did their jobs adequately given their requirements and constraints. I think most of them did what they thought was right taking everything into account.

We had numerous professors in the academic departments, schools, and colleges of Georgia Southern who I thought were exemplary scholars, researchers, writers, teachers, and human beings, and who did their best to advance the interests of their students and mankind.

I think most administrators and professors played some psychological games, being forced to for survival in some cases; but the major problem, in my opinion, was that university system precedents and processes caused a chronic general overcompensating of administrators relative to professors. This problem is apparently common in school and university systems. Colleagues at professional meetings throughout the country told me they thought university systems in most states bestowed higher incomes on administrators who did relatively little or no scholarly or intellectual work who had somehow nevertheless been ordained as superior to professors. How this metamorphosis from professor to administrator comes about remains a mystery. For a definition and discussion of psychological games see Chapter 3, People Along the Way, in Business Voyages.

On the other hand, a major cause of salary inequities in AACSB business schools is the market for new PhDs. If new PhDs in a business field are scarce and you have to hire them to maintain AACSB standards—the AACSB among other things requires for accreditation that 40 percent or more of business faculty in all business disciplines in a school must have a PhD, DBA, or other qualified terminal degree—and if your school is competing with various states and several universities you may have to offer new PhDs with slim or no track records competitive salaries determined by current supply and demand in the market for new PhDs, which may be higher than the salaries of your long-service productive professors with good or excellent proven track records, and if the system thinks administrators should be paid as much as or more than the highest paid new PhDs, and if merit raises are percentage-based in subsequent years, salaries become increasingly inequitable through time. This problem is known as “compression” and probably most AACSB (American Association of Collegiate Schools of Business) business schools have a problem keeping their faculty and administrative salaries equitable over time because of compression.
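The arithmetic of percentage-based raises can be made concrete with a small sketch. All figures here are hypothetical (the starting salaries, the 3 percent raise, and the ten-year horizon are assumptions for illustration, not data from Georgia Southern); the point is only that an identical percentage raise widens the dollar gap between a new hire who starts higher and a long-service professor who starts lower.

```python
def project(salary, raise_rate, years):
    """Return the salary at year 0, 1, ..., years under a fixed percentage raise."""
    return [salary * (1 + raise_rate) ** year for year in range(years + 1)]


# Hypothetical starting salaries and a hypothetical uniform 3 percent raise.
senior_professor = project(70_000, 0.03, 10)
new_phd = project(85_000, 0.03, 10)

# Dollar gap between the two salaries in each year.
gaps = [new - senior for new, senior in zip(new_phd, senior_professor)]

print(f"Gap at hiring:      ${gaps[0]:,.0f}")
print(f"Gap after 10 years: ${gaps[-1]:,.0f}")
```

Under these assumptions the gap starts at $15,000 and grows to just over $20,000 after ten years, even though both people received the same percentage raise every year.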

Getting and keeping AACSB accreditation for over 30 years, as we managed to accomplish in our business school at Georgia Southern during my tenure, was not easy. Only about 20 percent of 2,000 or so US colleges and universities offering business courses during this period achieved AACSB accreditation. We were one of four of twenty or so four-year colleges and universities in the University System of Georgia—Georgia Southern, University of Georgia, Georgia Tech, and Georgia State—to maintain AACSB standards during most of this period.

Last year I asked one of my classes why so many students dropped after the first day. A student blurted out “You only give 11 percent A’s,” which amazed me since that was about right for that course. I asked him how he knew that. He said it’s on pickaprof.com. Answering my question “What is pickaprof.com?” he said, “It’s a site on the web.”

Sure enough, it’s there, for the whole world to see. To find out the grade point averages of grades assigned by professors in my university and in many other universities all anyone in the world has to do is punch pickaprof.com into a computer search engine and hit the search button.

The advent of the Internet and pickaprof.com enabled me to learn in Fall 2002 some of the most significant and relevant facts I have ever learned about my teaching. Since grade distributions assigned by teachers were always kept secret in our business school, I had no idea how my grades compared to the grades of other teachers. According to pickaprof.com, my grades have been considerably lower than the grades of all other teachers teaching the same courses I have taught and have been lower than those of most teachers in the college. My grade point average posted on pickaprof.com for an integrative capstone business policy course, required for all business majors, which I had taught to undergraduate students for 32 years, was 2.47, which was, I had always thought, about right. According to pickaprof.com, the next lowest grade point in the course assigned by my colleagues was 2.86, and, among 6 or so teachers usually teaching the course every semester, they ranged from 2.86 up to 3.78, on a 4-point scale. Perhaps as many as half the teachers were assigning almost nothing but A’s and B’s in some undergraduate courses, particularly in courses that were designed to teach students to think and integrate different business disciplines such as economics, accounting, finance, management, and marketing.

Grade point averages posted on pickaprof.com are real since they are downloaded directly from registrar’s offices in states that have open records laws, as does our state. A faculty member who assigned relatively high grades, and whose salary was higher than mine, bullshooting with me in the hall one day about the morality of grading (Baier, 1958; Bedeian, 2002; Haskell, 1997; Mieczkowski, 1995) said, “Hey, I’m just a working class guy doing what I thought I was supposed to do.”

The student entrepreneurs starting pickaprof.com may do more to De-Game higher education than any innovation so far, if they stay in business, since they are putting a lot of truth on the table. They have already removed a veil of ignorance from many eyes. While hundreds of Internet businesses have gone bankrupt and have vanished since the technology bubble burst in 2000, some, like eBay, have generated accounting profits. It would be helpful, in my judgment, if several new Internet businesses like pickaprof.com were started to post the grade distributions of all teachers in all schools and could make enough profit to stay in business indefinitely.

The Internet is not only causing people to know more about grades in education; it is causing people to know more about prices in business. While grade inflation in education has been rampant, price inflation in business has been quiescent in recent years. Because of people knowing more about prices for goods, services and wages worldwide, businesses and businesspeople everywhere are subject to more competition, which holds down prices for consumers. Whether this will happen with grades because of the Internet remains to be seen. If employers can easily learn which grades are most inflated, this might cause schools and teachers producing inflated grades to improve, since few employers would want to hire their students at competitive salaries (Forbes, 2003). Rather than simply requiring job applicants to have a high grade point average, businesses and other employers might be more concerned with what and how much students actually learned in school.

Global business practices aided and abetted in recent years by personal computers, the Internet, cell phones and rapid delivery services for real goods are changing at an increasing rate, helping some people and hurting some people, in rich and poor countries alike. How human relationships will sort out worldwide because of this process is far from clear. Hopefully an ultimate result will be that life will become more truthful, honest, fair and just for more people. Most schools, businesses, governments, and other organizations as usual are sailing in uncharted waters, hoping current courses and practices will work for most students, members, participants, employees, associates, managers, workers, stockholders, administrators, proletarians, parishioners, congregants, devotees, aristocrats, voters, citizens, monarchs, officials, subjects, or whatever as long as possible.

Academic Alchemy

Teaching evaluation processes would be improved if educated and fair-minded adults would read the grading procedures and tests used in courses and randomly visit classrooms to see, hear, and feel what is going on when teachers teach. This is not allowed or encouraged in most cases because administrators say they do not have time to personally visit the classes of faculty within their units or supervise parents or other faculty observing teaching. Most college teachers and administrators would probably say in-class observations would create more problems than they would solve. I invited administrators to randomly visit my classes to see what was going on. No one ever accepted my offer for a random visit. Two or three administrators visited my classes once or twice each in my 35-year career by appointment. One administrator became upset when I suggested he randomly visit my classes to see what was going on. He accused me of being unethical since he said I should have known he did not have time to visit the classes of all professors in the department randomly or otherwise. Consequently, almost all colleges and universities use student evaluations to grade teachers. Numerical grades assigned by students are summed and averaged to develop ranks of teachers. Teachers with the highest numerical scores are assumed to be the best teachers. This is problematic for several reasons.

In the first place, using student evaluations to rank faculty assumes all students are ethical, capable, and rational evaluators. Some students are far from ethical, rational and capable when they evaluate teachers. Some take revenge against teachers if they feel a teacher has been too demanding or unfair. Only a small percentage of students taking revenge against a teacher in a student evaluation can ensure that the teacher will be ranked in the lower half of the department, even if the majority of the students in the class thought the teacher was a good teacher, even if some students thought the teacher was excellent.
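A small numerical sketch shows how this can happen. The class size, the ratings, and the departmental median of 4.2 are all hypothetical, chosen only to illustrate the arithmetic: three students out of thirty (10 percent) switching from a rating of 4 to a rating of 1 is enough to move the class average from above the assumed median to below it.

```python
from statistics import mean

# Hypothetical departmental median of class averages on a 1-5 scale.
DEPARTMENT_MEDIAN = 4.2

# Hypothetical class of 30: twenty students rate the teacher 4, ten rate 5.
without_revenge = [4] * 20 + [5] * 10

# Same class, except three of the students who would have given a 4 give a 1.
with_revenge = [1] * 3 + [4] * 17 + [5] * 10

for label, ratings in [("without revenge", without_revenge),
                       ("with revenge", with_revenge)]:
    average = mean(ratings)
    position = "above" if average > DEPARTMENT_MEDIAN else "below"
    print(f"{label}: average {average:.2f}, {position} the assumed median")
```

How far a real ranking moves depends on how tightly the other teachers' averages cluster, which this sketch does not model.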

In the second place, using student evaluations to rank faculty assumes all teachers are ethical, capable, and rational. While most teachers are ethical, capable, and rational, some are not when dealing with the ethical problems student evaluations entail. In order for a teacher to ensure that she or he is not ranked poorly in a student evaluation, a teacher must make sure that he or she pleases almost every student in the class, including students who are below average in the subject, and like it or not, most college or university classes will include some students who are below average in the subject of the course. Pleasing all students entails the teacher not requiring anything that below average achievers cannot or will not do, spoon-feeding sound-bite-like answers in class to tell all students what to memorize for tests, or curving grades in such a way that D grades look like C or better grades.

Taking numbers on a scale of 1-5, from poor through excellent, reflecting the judgments of poor, average, good, and excellent students in a class regarding a teacher and then summing these numbers and then dividing by the number of students in the class to derive an average class score is like combining baskets of apples, oranges, plums, bananas, and grapes to create a basket of pears in a process of academic alchemy. If 20 percent of the students in the class thought the teacher was poor and 20 percent thought the teacher was excellent, which group was really right? Assuming truth about teaching exists, this is an important question. Both groups cannot be right, and a rating in between is also not right. The difficulty of the problem is compounded if the 20 percent of the students who thought the teacher was poor were poor students and the 20 percent of the students who thought the teacher was excellent were excellent students. Adding 20 percent apples to 20 percent grapes will not magically produce any percent of any other kind of fruit.
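The loss of information in the averaging step can be shown with two hypothetical classes of 40 students. In the first, every student rates the teacher 3; in the second, 20 percent rate the teacher 1 (poor), 20 percent rate the teacher 5 (excellent), and the rest rate 3. The ratings are invented for illustration only.

```python
from collections import Counter
from statistics import mean

# Hypothetical class in which all 40 students give the middle rating.
uniform_class = [3] * 40

# Hypothetical class in which 8 students (20%) say "poor" and 8 (20%) say "excellent".
divided_class = [1] * 8 + [3] * 24 + [5] * 8

for label, ratings in [("uniform", uniform_class), ("divided", divided_class)]:
    distribution = dict(sorted(Counter(ratings).items()))
    print(f"{label}: average {mean(ratings)}, distribution {distribution}")
```

Both classes produce the same average, so a rank built on the average alone cannot tell them apart, and it says nothing about which students gave which ratings.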

Student evaluations are not true propositions. Ranking faculty based on averages generated by student opinions, biases, prejudices, emotions, and the like and then using the ranks as if they were true propositions is a form of academic alchemy; yet supposedly intelligent and educated administrators do this almost everywhere year in and year out. Administrators do this because using these mindless ranks is the easiest way out for them, not requiring them to take responsibility for taking into account and analyzing all the facts and evidence generated by the teaching of teachers within their administrative units. They also use these ranks to defend themselves from faculty who might complain about a merit raise based on teaching. The administrator can point to the numerical rank and tell the faculty member his or her raise relative to the raises of other faculty was proportionate to the teaching rank of the faculty member relative to other faculty members.

Student evaluations are necessary in colleges and universities to provide evidence showing what all students thought about the teaching of teachers, to prevent teachers being at the mercy of random student hearsay conveyed to administrators personally in hallways and offices, but they should not be used to develop ranks obviating the need for rational analysis of all the facts and evidence generated by the teaching of a teacher, or the learning of his or her students. Summary ranking in some cases may be helpful if the summary ranks are created in an appropriate way, but even this ranking should be but one part of the overall analysis of the teaching of a teacher.

Optimizing the Fairness of Student Evaluations

For more information on how to create appropriate ranks for teaching evaluations, see my article “Optimizing the Fairness of Student Evaluations: A Study of Correlations Between Instructor Excellence, Study Production, Learning Production, and Expected Grades” (Stapleton & Murkison, 2001), published as a lead article by the Journal of Management Education. Summaries of this article have been posted on the Internet by professors concerned about student evaluations internationally. You can probably still find my Optimizing Fairness article on the Internet by typing my name into a search engine. On the other hand, you can secure an electronic copy of “Optimizing the Fairness of Student Evaluations” from the Internet by typing Journal of Management Education into a search engine.

My article “Optimizing the Fairness of Student Evaluations” has by now, January 23, 2010, been cited as a reference by university researchers in 32 published journal articles concerned with the problem of student evaluations, according to a Google Internet search. Various professors around the world have posted the article on their personal web sites, and several centers and Sage Publications, the publisher of the Journal of Management Education, have included the article in anthologies of important articles dealing with educational issues and problems. To read these articles, anthologies, and a PDF copy of “Optimizing the Fairness of Student Evaluations” from your computer, all you have to do is punch Optimizing the Fairness of Student Evaluations into your Internet search engine and access the web sites that appear on your screen. See especially “Optimizing the Fairness of Student Evaluations,” in Learning from Journal Articles: Strategies for Maximizing Student Learning by Larry C. Holt and Marcella Kysilka ( holt/articles/index.htm).

If student evaluation ranks are used, ranks should be created for instructor excellence, study production, learning production, and relative expected grades, and all four of these ranks should be shared with faculty. If a summary rank is to be created for each teacher in the department, it should be a rank of the averages of ranks for instructor excellence, study production, learning production, and relative expected grades. I call this summary rank a Composite Indicator of Teaching Productivity (CITP) (Stapleton & Murkison, 2001). The most important letter in the CITP is the I (Indicator). The CITP, as relatively sophisticated as it is, is merely an indicator of teaching productivity. As with all summary ranks of teaching, the CITP should not be used to obviate the need for an overall analysis of all the facts and evidence generated by the teaching of teachers. Whoever is responsible for evaluating teaching should not be allowed to abdicate his or her responsibility for fairly evaluating and rewarding the teaching of a teacher relative to other teachers by irresponsibly and mindlessly adding up some numbers, deriving a simplistic rank, and then dividing up the strokes and the merit raise money for teaching accordingly.
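The rank-of-averaged-ranks idea described above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure from the article: the three teachers, their mean scores, and the function names are hypothetical. The choice to give a better rank to lower relative expected grades is also an assumption, made on the reasoning that challenging teachers tend to produce lower relative expected grades.

```python
from statistics import mean


def rank_scores(scores, higher_is_better=True):
    """Return ranks (1 = best); tied scores share the average of their positions."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=higher_is_better)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j across any run of tied scores.
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        shared_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = shared_rank
        i = j + 1
    return ranks


def citp(measures, higher_is_better):
    """Rank each measure, average the ranks per teacher, then rank the averages."""
    per_measure = {
        name: rank_scores(scores, higher_is_better[name])
        for name, scores in measures.items()
    }
    teachers = list(next(iter(measures.values())))
    average_rank = {
        t: mean(per_measure[name][t] for name in measures) for t in teachers
    }
    # A lower average rank is better, so rank the averages in ascending order.
    summary_rank = rank_scores(average_rank, higher_is_better=False)
    return per_measure, average_rank, summary_rank


# Hypothetical department means on a 1-5 scale (not data from the study).
measures = {
    "instructor_excellence": {"A": 4.6, "B": 4.1, "C": 3.8},
    "study_production": {"A": 2.9, "B": 3.6, "C": 4.4},
    "learning_production": {"A": 3.9, "B": 4.2, "C": 4.3},
    "relative_expected_grades": {"A": 3.8, "B": 3.1, "C": 2.6},
}
# Assumption: lower relative expected grades earn the better rank.
higher_is_better = {
    "instructor_excellence": True,
    "study_production": True,
    "learning_production": True,
    "relative_expected_grades": False,
}

per_measure, average_rank, summary_rank = citp(measures, higher_is_better)
print(per_measure["instructor_excellence"])  # teacher A ranks first on this measure alone
print(average_rank)
print(summary_rank)  # teacher C ranks first on the composite
```

In this made-up department, teacher A would be ranked first if only instructor excellence were ranked, but falls to third on the composite, while teacher C moves from third to first. All four per-measure ranks remain available to share with faculty alongside the summary rank.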

I did not reprint “Optimizing the Fairness of Student Evaluations” in Business Voyages because of the length of the article, 33 pages, and because it contains some tables and graphs that would have been difficult to reproduce in the Business Voyages format. Basically the article contains two sets of correlations. The first set is pairwise correlations among 14 variables (representing 14 questions on the student evaluation questionnaire), computed using data aggregated by the department of management as a whole, 29 faculty members and 1,251 student evaluation forms, creating a 14 row by 14 column matrix. The second set is 4 correlations computed using data aggregated by specific faculty members, including all 29 members and all 1,251 forms: between Instructor Excellence and Study Production, Instructor Excellence and Learning Production, Instructor Excellence and Relative Expected Grades, and Study Production and Learning Production. The SPSS statistical package was used to compute the 14 by 14 matrix showing the correlations between all 14 variables using both means and medians. Pearson correlation coefficients using means and Spearman correlation coefficients using ranks were computed for the four pairs of variables mentioned above. Scatter diagrams show visually the correlations of the four pairs of variables for both the Spearman and Pearson correlations. There were no statistical differences between the correlations of the means and medians in the SPSS matrix, so medians were discarded.

The data and analysis show that making assertions, generalizations, or propositions about student evaluations is not a simple matter. The data and analysis of Optimizing Fairness show that results vary not only by the way statisticians aggregate the data of student evaluations, that is, whether as a department or university as a whole or by specific faculty members, but also by what they use as data for the correlations, that is, means or ranks. If you lump all student evaluation questionnaires into the computer by department, school, university, or whatever, correlations between study production and instructor excellence are nonsignificantly positive (.09, p = ns, in our case), indicating the teachers rated highest as instructors may require as much homework as teachers rated poorest as instructors. On the other hand, if you tell your computer to lump the student evaluations of each faculty member together, as I did, and then compute a correlation between instructor excellence and study production between faculty members using ranks, the resulting correlation is negative (-.2793, p < .07), indicating those instructors requiring the most homework were generally rated poorest as instructors and those instructors requiring the least homework were generally rated highest as instructors. What this boils down to is that if researchers aggregate data by whole administrative units the computer treats the data as if students were taught in one huge class by one teacher, washing out the differences among instructor excellence scores and study production levels among different teachers, hiding the differences between the hard and the easy teachers.
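The effect of the aggregation choice can be illustrated with a small Python sketch. The nine evaluation forms below are invented for illustration and are not data from the study; the teacher labels and function names are hypothetical. The sketch computes a Pearson correlation over all forms pooled together, and then a Spearman correlation (a Pearson correlation on ranks) over per-teacher means.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simple_ranks(values):
    """Rank values from 1 (smallest) upward. Assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


# Hypothetical forms: (study production, instructor excellence) per student.
forms = {
    "T1": [(1, 3), (2, 4), (3, 5)],  # least homework, highest ratings
    "T2": [(2, 2), (3, 3), (4, 4)],
    "T3": [(3, 1), (4, 2), (5, 3)],  # most homework, lowest ratings
}

# Aggregation 1: pool every form as if one teacher taught one huge class.
all_forms = [pair for pairs in forms.values() for pair in pairs]
pooled_r = pearson([sp for sp, _ in all_forms], [ie for _, ie in all_forms])

# Aggregation 2: average within each teacher, then correlate ranks across teachers.
sp_means = [mean(sp for sp, _ in pairs) for pairs in forms.values()]
ie_means = [mean(ie for _, ie in pairs) for pairs in forms.values()]
by_teacher_r = pearson(simple_ranks(sp_means), simple_ranks(ie_means))

print(pooled_r)
print(by_teacher_r)
```

With these invented numbers the pooled correlation is zero while the between-teacher rank correlation is strongly negative, which is the kind of washing out described above. Real data would of course be far noisier than this deliberately tidy example.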

On the other hand, the correlation between instructor excellence and relative expected grades was negative regardless of how correlations were computed, although the negative correlations using data aggregated by specific faculty members were much higher than the correlation using data aggregated by the department as a whole. The correlations between instructor excellence and relative expected grades using data aggregated by specific faculty members were -.56, p = .00 using means and -.61, p = .00 using ranks, whereas the correlation from the SPSS matrix compiled from data aggregated by the department as a whole using means was -.26, p < .10.

Here is the conclusion of the article:

"Because a positive linear relationship between study production and learning production ranks does not exist in this study, (see Table 5) and because there is a negative relationship between study production and instructor excellence ranks (see Table 2), it is possible for some percentage of faculty members to lower homework requirements and grading standards to increase expected grades production (see Table 4) and to increase their instructor excellence scores and learning production scores (see Table 3) on some student evaluations; and conversely, it is possible for some percentage of faculty members to lower their instructor excellence scores on some student evaluations by increasing homework requirements, raising grading standards, and lowering expected grades" (Stapleton & Murkison, 2001, pp. 289-290).

The article is titled “Optimizing the Fairness of Student Evaluations” because there is no way you can devise a system that will please all teachers in a school. If you only rank Instructor Excellence you will discriminate against challenging teachers. If you rank only Learning Production you will discriminate against teachers who score high on Instructor Excellence but low on Study Production and Learning Production. Since a CITP weights equally ranks for Instructor Excellence, Study Production, Learning Production, and Relative Expected Grades, it tends to create the greatest possible satisfaction for the greatest number of faculty members, an optimum solution providing a modicum of fairness for all players. This is the best you can hope for, since there apparently is no perfect solution.

This CITP solution is better than inanely ranking only Instructor Excellence, as was done in the business school at Georgia Southern by many administrators for many years, and as has been done, and is still being done, by thousands of administrators in schools worldwide. I think I have at least taught administrators in the business school at Georgia Southern that you should not do this. On the other hand, Georgia Southern still refuses to put a relative expected grades question on the student evaluation form used campus-wide, including the business school. In 2000 I convinced Georgia Southern they should put study production and learning production questions on the campus-wide form, but instead of following my instructions to use a relative expected grades question asking students if they expected a grade in the course higher or lower than they normally expect in courses given how much they worked and studied in the course, Georgia Southern decided to simply ask students what absolute grade (that is, whether A, B, C, etc.) they expected in the course. This diminishes the effectiveness of the CITP process, biasing the game in favor of the less challenging teachers in a clearly sub-optimum way. Asking students whether they expect an A, B, C, D, or F proves very little, since many students expect A’s and B’s no matter what; otherwise they will lambast the teacher on the student evaluation. Some students expect B’s even if they have 60 averages at the time they fill out their teacher evaluation forms, even in courses taught by challenging teachers, since they expect the teacher to cave in and use a curve if necessary to produce a B for a final grade. Many students have been taught to expect this teacher behavior in high schools and in other college and university courses. A’s and B’s, after all, are necessary to get and keep a Hope Scholarship in Georgia.
Yet, even if students expect high grades from challenging teachers, and even if there are small differences in means generated from numbers for absolute expected grades, if you rank the mean scores, in most cases the ranks will show much of the real difference between challenging and pandering teachers, which is another reason ranks and not means should be averaged for the summary ranking.

My research findings regarding the significant and pernicious effects of relative expected grades at Georgia Southern substantiate and generally replicate earlier research and findings of Greenwald and Gillmore (1997a, 1997b) regarding the significance and effects of relative expected grades at the University of Washington.

For an example of a relative expected grade question and how it can be used please skip ahead to page 516 in Business Voyages and see how one was actually used as reported in my Transactional Analysis Journal article “Teaching Business Using the Case Method and Transactional Analysis: A Constructivist Approach” (Stapleton & Stapleton, 1998).

A Relative Expected Grade Question

The relative expected grade variable is the most significant determinant of the overall teacher evaluation by students.  Here is an example of a relative expected grade question that should be on all student evaluation forms to optimize fairness:  

Given my efforts in this course, the grade I expect to receive may not be the same I think I deserve. It will be
1 = much lower
2 = lower
3 = the same
4 = higher
5 = much higher

(I added the immediately above paragraph and question to this case August 15, 2016 as an afterthought.  The question is an exact copy of the relative expected grades question we used on the student evaluation form used for our Optimizing the Fairness of Student Evaluations study and article).

An exhaustive and thorough research study dealing with problems of student evaluations, titled “The Use and Effects of Student Ratings in Legal Writing Courses: A Plea for Holistic Evaluation in Teaching,” was recently published by Judith D. Fischer (2004), an associate professor in the Louis D. Brandeis School of Law at the University of Louisville, in Legal Writing: The Journal of the Legal Writing Institute. It included some of my research and the research of Greenwald & Gillmore, as well as many other studies. Here is what Fischer had to say about my research in “Optimizing the Fairness of Student Evaluations”:

"A recent study by Stapleton and Murkison (2001) dramatically showed the limits of the term “valid” as applied in student ratings. Their study showed a positive correlation between student ratings and student-reported learning. But theirs was perhaps the only study to break down and report data by professor. Broken down that way, the data revealed that some instructors confounded the general trend: of the twenty-nine instructors studied, four who produced learning in the top half received ratings in the bottom half, while four who produced learning in the bottom half received ratings in the top half. Had personnel decisions been made on the basis of these data, with a cutoff at the median, four of the more effective professors would have been punished or dismissed, while four of the less effective ones would have been rewarded. This study highlights an important point about statistical data: an overall correlation between two variables does not mean that one variable is always correlated with the other in particular instances.

"Stapleton and Murkison did not attempt to explain why eight professors confounded the general pattern. But studies have demonstrated that a number of factors other than teaching effectiveness influence ratings" (Fischer, 2004, pp. 123-124).

Professor Fischer then goes on to discuss in her article some of the major factors affecting student ratings, citing the work of Greenwald and Gillmore (1997a, 1997b) regarding the importance of relative expected grades and the findings of several other studies. Apparently law schools have many of the same problems as business schools using student evaluations and there are also pressures in law schools for professors to spoon-feed right answers to make it easy for students to make high grades on tests. Apparently grading the thinking, reasoning, and relative comprehension of students on written exams, class participation, and papers in law schools is fraught with hazards. Given the importance of law students graduating with a high overall academic ranking relative to their peers for professional purposes, law school professors may have to worry even more than other kinds of professors about vengeful students retaliating on student evaluations.
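The median-split comparison Fischer describes, finding instructors whose learning rank and rating rank fall in opposite halves of the department, can be sketched in Python as follows. The six teachers and their ranks are hypothetical, not the twenty-nine instructors of the study, and the function name is invented for illustration.

```python
def split_mismatches(learning_rank, rating_rank):
    """Find teachers whose two ranks fall in opposite halves (rank 1 = best).

    Assumes an even number of teachers with ranks 1..n, so the top half
    is simply every rank at or below n / 2.
    """
    cutoff = len(learning_rank) / 2
    high_learning_low_rating = sorted(
        t for t in learning_rank
        if learning_rank[t] <= cutoff and rating_rank[t] > cutoff
    )
    low_learning_high_rating = sorted(
        t for t in learning_rank
        if learning_rank[t] > cutoff and rating_rank[t] <= cutoff
    )
    return high_learning_low_rating, low_learning_high_rating


# Hypothetical ranks for six teachers (not data from the study).
learning_rank = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
rating_rank = {"A": 1, "B": 4, "C": 2, "D": 3, "E": 5, "F": 6}

penalized, rewarded = split_mismatches(learning_rank, rating_rank)
print(penalized)  # top-half learning, bottom-half rating
print(rewarded)   # bottom-half learning, top-half rating
```

In this invented example the two ranks agree for four of the six teachers, yet a cutoff at the median would still penalize teacher B and reward teacher D, which is the point about overall correlations and particular instances.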

Entertaining teaching by cheerful and attractive teachers may attract fun-loving students and build up enrollments, but such teaching may not produce significant learning. Significant learning is generally produced by people being exposed to perturbing experiences that convince them their current scripts and schemata are obsolete and that new or better ones are necessary for dealing with environmental threats. Rather than produce happy-meal experiences in class, some effective teachers produce perturbing experiences that motivate students to learn in significant ways. Needless to say, this will not produce a high student evaluation rank given the averaging process most such ranks entail. A CITP averaging process, however, improves the summary ranks of challenging teachers because it weights equally instructor excellence (pleasing most students), study production (displeasing many students), learning production, and relative expected grades (displeasing many students). The CITP gives challenging teachers credit for displeasing some students, which was necessary to produce perturbing experiences necessary to produce significant learning.

Hundreds of articles have been published in professional journals, magazines, and newspapers in the last 50 years dealing with problems created by students evaluating teachers in colleges and universities, yet to this day student evaluations continue to create frustrations and injustices that bedevil many teachers. It’s possible most teaching evaluation systems will always be based on academic alchemy and serious teachers will have little choice but to put up with the make-believe realities of teacher evaluation systems. Although professors in colleges and universities may attain tenure and academic freedom, they are never free from being subjected to yearly teaching evaluations conducted by students and supposed superiors in administrative positions. Going through the yearly teaching evaluation process in colleges and universities is like being a defendant in court every year, with students serving as the jury and an administrator serving as the judge.

The process has advantages and disadvantages. On the one hand it creates pressure and stress, because no one wants to be labeled a bad teacher and student evaluations can have significant consequences: low raises and possibly no promotion to associate or full professor for those with tenure, and even higher stakes, not getting tenure, for those without it. On the other hand, student evaluations create goals for professors which if achieved result in satisfaction and a feeling that one has accomplished something worthwhile. Successful teaching is causing students to learn new skills, knowledge, and dispositions; but securing a high student evaluation score entails creating a strategy that requires no or some or much homework, no or some or much research and writing, essay or quantitative or multiple-choice or true-false exams, no or some or much graded class participation. Creating and implementing this strategy entails uncomfortable and stressful moral and ethical choices and tradeoffs.

The Morality of Grading

There is evidence grade inflation is rampant, not only in public schools but also in colleges and universities, from top-tier universities such as Harvard to Podunk State (Fischer, 2004; Greenwald & Gillmore, 1997a, 1997b; Healy, 2001; Johnson, 2002; Stapleton & Murkison, 2001; Stapleton & Stapleton, 2003b; The Economist, 2001, 2002). What’s wrong with grade inflation, you might ask? It’s wrong because it’s immoral. Telling falsehoods is immoral (Baier, 1958); inflating grades is telling falsehoods; therefore inflating grades is immoral.

Students in most classes are not equal achievers. If you calibrate (dumb down if you will) your questions and course requirements in such a way and to such a degree that the poorest achievers in all or most of your classes make A’s and B’s you are telling the falsehood that your poor students are really good or excellent. To fairly grade you should give each student what she or he deserves based on achievement relative to his or her peers. What is a peer in this context is debatable but in general it is any student subject to doing and learning what you teach. If you give the same grade to excellent and poor achievers in your class you are being unfair to your excellent achievers by depriving them of recognition for their achievements. If you lower your standards in such a way as to make it impossible for your better students to achieve at an excellent level relative to your poorer students you are depriving your better students of the opportunity to achieve at an excellent level, which is unfair because you are depriving your better students of recognition they could have earned.

If you eliminate these opportunities for your better students you will probably decrease the incentives and motivation to learn of all students in your classes, including your poorer students, since they will not be stimulated to exert themselves attempting to stay up with the better students. If it is your purpose not to grade students at all that is one thing, but if your system says you are supposed to grade students as excellent, good, average, poor, and failing, if you lower your standards in such a way that poor students can answer all or most of the questions right or do everything you say they are supposed to do in what you define as an excellent or a good way and all students make A’s or B’s you in effect are not grading students as your system says you are supposed to grade, which is also immoral, assuming your system is moral and honestly wants you to grade students as excellent, good, average, poor, and failing.

But what if all students actually learn 80 percent or more of the content of the subject you are teaching in your course, shouldn’t all students in the course receive A’s and B’s? Yes they should if that really is the case. On the other hand it seems to me in the vast majority of cases it is ridiculous to say students learn 80 percent or more of everything there is to learn about the subject being taught in a course. The content of courses is almost always a sample of the total content of the subject; and if you make your sample unduly small and spoon-feed students everything they need to know for tests you are engaging in a form of immorality as outlined above.

In my opinion the moral way to do it is to set your standards high enough that they truly challenge your excellent students and grade the achievements of all your students relative to the achievements of your excellent students, ascertaining whether your students achieve at good, average, poor, or failing levels relative to your excellent students, and let the chips fall where they may. What do you think would happen if football and basketball coaches were to say all their players were good and excellent and tried to let all of them play on the first string?

You can honestly build a case that grading people is immoral period because it stigmatizes and all people should be accepted unconditionally; and you can honestly argue that grading in schools, colleges, and universities should be abolished, or that schools, colleges, and universities should be abolished. Given the way things are, however, I can also honestly argue that if someone does not want to be fairly and honestly graded then he or she should not enroll in mainstream courses in schools, colleges, and universities.

In the case of grading in schools, colleges, and universities you can argue that inflating grades is moral because it helps the worst off and it might cause the greatest good for the greatest number. But it seems to me after all is thought and said, in an educational environment, truthfulness as a moral criterion should trump all others (for sure no pun intended these days, circa July 16, 2016).

Why is morality important you might ask? In my book morality is essential for the functioning of a decent, productive, and satisfying culture—which a good teacher can create in a classroom in some countries. The more teachers there are creating good cultures in classrooms in as many countries as possible the greater the chance decent, productive, and satisfying cultures shall some day exist among all people aboard Spaceship Earth.

A Question of Free Will

Whether free will exists is a relevant question. If it exists humans can learn to be autonomous as TA people espouse, and humans can use their Adult ego states to read and observe facts, phenomena, processes, books, articles, and other intellectual and artistic creations and create and choose profitable and ethical things to feel, think, and do regardless of script messages, rackets, and Games passed down from ancestors that are introjected into and stored in Parent and Child ego states in brain cells and sub-conscious minds (Allen, 2003; Barnes, 1994; Clarke, 1999; Erskine & Moursund, 1988; James, 1974; McCulloch, 1988; Osnes & Gesme, 2000; Popper & Eccles, 1977; Steiner, 2003).

On the other hand, we humans may be like dominoes falling in cause-effect chains. A Princeton professor, A. L. Goldman, in his book A Theory of Human Action (1970), proposed and presented sound analysis and reasoning for the proposition that every effect has a cause, even effects such as human feelings, thoughts, and wants, and therefore human action is deterministic. It seems to me this is plausible even if humans never develop the capability to detect and track the exact cause-effect sequences leading up to specific human actions. It seems plausible to me that human actions that appear to be spontaneous, random, or accidental were actually caused, deterministic, and inevitable. If this is true in general let us pray humanity is being caused to evolve structures and processes that will result in survival and satisfaction.

According to Ron Powers in his new biography Mark Twain: A Life (2006), Mark Twain thought humans were machines doing what was inevitable with their lives. I have no idea what sort of reasoning Mark Twain might have used to reach this conclusion but most likely the conclusion was based on his observation of humans in real life. No doubt Mark Twain had greater powers of visual and auditory recall than most people or he could not have written the stories and books he wrote and he probably accurately saw and heard people talking and acting in programmed ways. Most psychiatric theories of human behavior were also based on similar observation of people, except in this case most of the observations took place in counseling sessions in offices and clinics. Most transactional analysis theories were developed based on observing human behavior and hearing about human behavior in clinical sessions. Albert Einstein was a strong believer in determinism and strict cause-effect relationships making things happen in the physical universe, including people.

Eric Berne the founder of transactional analysis wondered if he was like a player piano producing music written by his parents. If so he wrote it was probably better music than he could have written himself (Berne, 1970). He also thought only trivial decisions are generally made in the here and now using facts and reason, and that major decisions are generally a function of scripts co-constructed through time in the minds of decision-makers to enhance what are considered to be their best interests.

Some Greek Stoic philosophers and later philosophers such as David Hume believed it was possible for determinism and free will to simultaneously exist since although people might be caused to feel, think, and do what they feel, think, and do by external events, in most cases they are not forced to choose what they choose by an external force such as a more powerful person, and therefore their choices can be free. This view that it is possible for determinism and free will to simultaneously exist is known as compatibilism.

Strict determinists, on the other hand, argue that people are not free to make arbitrary choices because they must choose what they choose because of decision rules and criteria previously recorded in their brains. In other words, given the internalized rules and decision criteria people have come to learn and believe because of past experience, individuals have no choice but to choose what they choose when confronted with facts, objects, and decisions in external reality, and therefore their choices are not free.

The case can be made that people are analogous to computers and can only do what they have been programmed to do. In such a case the good news is that old programs can theoretically be changed and new programs can be read in; the bad news is that dysfunctional and deleterious programs in people most of the time are difficult to reprogram or replace. Since most people have been programmed to compete, many will fight to prove their existing programs are superior no matter what. For progress to occur powerful programmers must be caused to learn good programs that enable them to cause people with deleterious programs to learn good programs who then pass on their new good programs to others, and so on, in salutary never-ending cause-effect chains. Something causing schools, colleges, and universities to be populated with good programmers programmed with good programs is vital for such a process.

The most knowledgeable and intelligent physicist does not know with certainty what caused the Big Bang, assuming a Big Bang occurred nearly 14 billion years ago creating this universe, as many scientists think. No one knows with certainty whether there is an infinite regress of causes behind the Big Bang, or whether there is a non-caused cause, i.e., an infinitely intelligent free will, existing eternally, that caused all phenomena in space starting from nothing. No one knows with certainty whether there have been any number of Big Bangs creating any number of universes in an infinity of time and space.

It’s possible we humans are not presently equipped to prove what is the root cause of life, although some scientists have come close (Kauffman, 1993; Weiner, 1995). A tough relevant test question is: How could a reproductive life form spring forth from rocks, chemicals, gases, liquids, and electromagnetic forces in a primordial soup billions of years ago on Earth? Was life transported to Earth via asteroids?

According to Buckminster Fuller, Albert Einstein, and other top scientists (Fuller, 1969, p. 62), the purpose of science is to “honestly attempt to set in order ‘facts of experience’.” Based on this definition I think most case method teachers and transactional analysts have been scientists.

I retired after 38 years of business teaching in May 2005, 29 years of which involved the use of transactional analysis. I received a high annual stroke income (McKenna, 1974) from students throughout my career for which I am grateful. Because of this it’s possible my spinal column has shriveled relatively little compared to most people my age, 64. I think learning about transactional analysis in the 1970s caused my teaching career to be more productive and satisfying than it otherwise would have been, and I trust my teaching transactional analysis will help my former students navigate satisfactory business voyages. Many have told me they thought it did up to some point in time (Stapleton, 2003). I have taught some 3,500 students in university business classrooms all of whom were OK.

POSTSCRIPT:  I covered some of these issues in my 2016 book Born to Learn.  I wrote a short piece I called Appendix I at the back of the book recommending readers read our (Stapleton & Murkison) article mentioned in this case, titled "Optimizing the Fairness of Student Evaluations:  A Study of Correlations Between Instructor Excellence, Study Production, Learning Production, and Expected Grades," published in the Journal of Management Education in 2001 by the Organizational Behavior Teaching Society.  The JME is printed and published by Sage Publications, a major publisher of academic materials.  I pointed out in Appendix I that "Optimizing the Fairness of Student Evaluations" has by 2016 been cited in 61 refereed professional journal articles in several academic disciplines, proving the article has been read and used.  As I pointed out above in Games Educators Play, by 2003, when I wrote the case, the article had been cited in 31 journal articles.

In Appendix I of Born to Learn, which I wrote in 2015, I told readers they could access a free PDF copy of "Optimizing the Fairness of Student Evaluations" by entering this web address into the search slot of any browser: http://www.sagepub.com/holt/articles/Stapleton.pdf.  Unfortunately this is no longer possible.  I entered the address today (July 11, 2016) in the Google search slot and the page could not be found.

You can find any number of results, citations, and web addresses for "Optimizing the Fairness of Student Evaluations" by simply punching Optimizing the Fairness of Student Evaluations into Google or any browser, but, alas, you can no longer find a free PDF copy on the web, or at least I could not.  You can find addresses verifying the number of citations, now 61, and you can find a PDF copy at Sage Pubs, for $36, more than any of my books cost.  I have no idea why Sage Pubs provided a free PDF copy for 15 years and then took it down, or if it happened by accident some way. 

On the other hand, you can find print copies of the Journal of Management Education in any good academic library, but few readers will take the time to look the article up.  The article is thirty-three pages long and contains numerous exhibits, charts, graphs, diagrams and the like substantiating the findings.  The article provides hard evidence, proof some say, that teachers can in some cases increase their student evaluation scores and their merit raises by lowering the requirements and grading standards of their courses, by dumbing them down and teaching to easy tests.  Using the CITP, a Composite Indicator of Teaching Productivity, as described above, in the department or school will eliminate this possibility and optimize fairness for all teachers in the department or school.  Most teachers are just like most people in any vocation or profession.  They like to be recognized for doing a good job and want to be fairly rewarded based on their relative productivity and contributions.  The CITP will ensure this happens.

These issues are discussed in all my books offered for sale on this website on the Effective Learning Publications page.  Business Voyages covers the CITP in more detail than Born to Learn or Recommendations for Waking Up From the American Nightmare.  On the other hand, Born to Learn provides more comprehensive coverage of transactional analysis (TA) concepts and techniques, showing how they apply to such specifics as teaching methods, classroom layouts, testing and grading methods, student motivation, classroom management, and learning contracts.



While it is still impossible to find a PDF copy of Optimizing the Fairness of Student Evaluations by typing the above address into search engines on the Internet (so far as I know), by sheer luck, as I was reading an old Internet post of mine in December 2016, I came across a link to the full text article as linked below.  I clicked on it and, lo and behold, up the article popped, for all the world to see, in its full glory and grandeur, as if resurrected from the dead.  So here it is for your edification and enjoyment.

I have no idea how this happened, not being an expert regarding the linking and storing of articles on the Internet.  Please believe I am not making this up.

Published in 2001 by the Organizational Behavior Teaching Society in the Journal of Management Education, this article by Stapleton & Murkison has now been cited in 61 refereed professional journal articles. It provides insights into how to evaluate teaching and learning in schools, colleges, and universities, showing how difficult it is to evaluate teaching and learning fairly and why relative expected grades questions should always be included on student evaluation forms to provide a modicum of fairness.

To verify the 61 citations just punch Optimizing the Fairness of Student Evaluations into Google and read the sources.

Optimizing the Fairness of Student Evaluations gets to the heart of intractable problems of the teaching profession, the most serious of which is probably teacher evaluations. How can you or a teacher know how well a teacher is doing his/her job? What sort of criteria can you use for making this judgment? Certainly the purpose of teaching is to cause learning to occur in students, but how do you measure this? What kind of learning? How much learning? How much learning relative to what? What percentage of a prescribed content or syllabus a teacher causes students to memorize? Or how much learning a teacher produces in students relative to how much peer teachers produce? In other words are you attempting to measure absolute learning or relative learning?

Optimizing the Fairness of Student Evaluations presents a unique Composite Indicator of Teaching Productivity (CITP), one of the most sophisticated metrics of teaching productivity yet developed in the teacher evaluation literature.

Check it out at