Prospect Theory, Bias, and Chalk: Our 2017 March Madness Wrap


Congrats to the First Place Loser

Let’s start off in the obvious place: Mike Philbrick, the poor-man’s Gronkowski, went wire-to-wire in last place.

That makes us happy, and so first and foremost, we come to bury him.

It’s entirely his fault.  We know this because of how the scoring rules were constructed: assuming public betting markets are reasonably good proxies for the true odds of a team winning any particular game, every bracket – from the most sophisticated strategy to the purely random – should have had exactly the same total expected return, and hence an equal chance of victory.

The rules were set this way in order to maximize the odds that skill would emerge, and so to whatever extent we were successful in doing that, Mike has none.

[Image omitted.  Source: mysticalpha]

Sample Size is Small No Matter What

If Mike can scramble for one dignity-sparing life preserver, it would be this message that Adam Butler dropped into our internal chat system after the second round results were posted:

LOL that Dan Adams (last year’s winner) and MP (last year’s 2nd or 3rd place right?) are in a dead heat for this year’s last place. If there was ever a signal that this whole concept is a racket, this is it… hilarious.

It’s worth mentioning that Adam didn’t even submit a bracket this year due to a taupe-colored and cardboard-flavored aversion to whimsy.  As he put it to me during the editing of this very article, “Why not flip a coin 63 times and have people guess the sequence?”

Ridiculous though it may seem, he does have a point: while our rules were designed to maximize sample size, the largest possible sample size is capped at 63 (the total number of games in the tournament).  Unfortunately, even in our system, legacy errors still exist, and as we’ll discuss below, your entries still displayed remarkable bias, making the effective sample size considerably less than 63.  Even under ideal conditions – which our bracket challenge did not achieve – the sample size for March Madness is small enough that the outcome in any single year will appear lottery-like.

Investing is different because we have the opportunity to apply strategies repeatedly over a long period of time.  While the outcome of any short-term investment sub-period may have characteristics of randomness, over many repetitions patterns emerge.  This opportunity isn’t available in March Madness, but contrary to Adam’s assertion that this makes the entire thing a “racket,” the less sports-averse among us took this as the fundamental issue necessitating the largest sample size possible.  After all, the larger the sample size, the less random the outcome.
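For the statistically inclined, here’s a minimal simulation sketch of why repetition matters (the 5% per-game edge and the trial counts are our assumptions, purely for illustration): a picker with a genuine edge beats a coin-flipper only about two-thirds of the time over a single 63-game tournament, but almost always once the games number in the thousands.

```python
import random

random.seed(42)

def prob_beating_coin_flip(edge, n_games, n_trials=2_000):
    """Estimate how often a picker whose true per-game win rate is
    0.5 + edge outscores a pure coin-flipper over n_games games."""
    beats = 0
    for _ in range(n_trials):
        skilled = sum(random.random() < 0.5 + edge for _ in range(n_games))
        coin = sum(random.random() < 0.5 for _ in range(n_games))
        beats += skilled > coin
    return beats / n_trials

print(prob_beating_coin_flip(0.05, 63))    # ~0.68: one tournament is mostly noise
print(prob_beating_coin_flip(0.05, 6300))  # ~1.00: repetition reveals the edge
```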

This is why we equalized the expected points per team.  In our pre-tournament post, we wrote:

If we must endure legacy errors – and we really can’t see any (unoppressively demanding) way around it – then the next best option is to create parity between the total expected values of every team.  And the best way we know how to do that is to award points in inverse proportion to a team’s likelihood of advancing to a certain point in the tournament.
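The post doesn’t reproduce the exact point formula, but the simplest scheme with this property pays 1/p for a pick whose market-implied probability of success is p.  A minimal sketch, assuming that form (the function name and probabilities are ours, for illustration):

```python
def points_for_pick(p_advance):
    """Pay points in inverse proportion to a team's market-implied
    probability of advancing (assumed form: 1 / p)."""
    return 1.0 / p_advance

# Expected points are identical for any pick, favorite or longshot:
for p in (0.95, 0.50, 0.05):
    payoff = points_for_pick(p)
    print(f"p={p:.2f}  payoff={payoff:6.2f}  expected={p * payoff:.2f}")
# Every line prints expected=1.00 -- hence equal expected value for every team.
```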

We then asked a very simple question:

If the expected value of every team is the same, how are you going to make your picks?

And your collective answer was to get out a big ol’ crate of chalk.

Our March Madness Entries Were Still Irrational

When we set the total expected values equally for every team, the goal was to elicit a meaningful uptick in the number of upsets chosen.  In fact, in a perfectly rational world where betting markets perfectly reflect each team’s true odds of winning, the distribution of picks in the first round would have approached 50% for every team, regardless of seed: with expected values equalized, a rational entrant is indifferent between the two sides of any game.

Of course, that didn’t happen.

Judging by the divergence of our first round picks from Yahoo’s, which uses “standard” scoring rules, ReSolve’s scoring method barely made a dent in your approach to selecting winners:

Figure 1. First Round Pick Distribution for ReSolve and Yahoo March Madness Bracket Challenges, Yahoo Consensus Favorites, 2017

| Overall Seed | Team | Yahoo Public Picks | ReSolve Picks | Difference |
|---|---|---|---|---|
| 1 | Villanova | 99% | 93% | 6% |
| 3 | Duke | 99% | 91% | 7% |
| 2 | Kansas | 99% | 91% | 7% |
| 4 | North Carolina | 99% | 96% | 3% |
| 5 | Gonzaga | 98% | 93% | 6% |
| 6 | Kentucky | 98% | 96% | 3% |
| 7 | Arizona | 98% | 96% | 2% |
| 8 | Louisville | 98% | 93% | 5% |
| 9 | UCLA | 97% | 93% | 4% |
| 10 | Oregon | 95% | 86% | 10% |
| 11 | Baylor | 94% | 94% | -1% |
| 12 | Butler | 93% | 70% | 23% |
| 13 | West Virginia | 91% | 84% | 7% |
| 14 | Purdue | 89% | 76% | 13% |
| 15 | Florida State | 86% | 80% | 5% |
| 16 | Notre Dame | 85% | 83% | 2% |
| 17 | Florida | 85% | 69% | 17% |
| 18 | SMU | 79% | 69% | 11% |
| 19 | Iowa State | 79% | 79% | 0% |
| 20 | Virginia | 79% | 39% | 40% |
| 21 | Michigan | 76% | 67% | 9% |
| 22 | Cincinnati | 75% | 71% | 4% |
| 23 | Wisconsin | 72% | 39% | 33% |
| 24 | Wichita State | 71% | 67% | 4% |
| 25 | Marquette | 64% | 79% | -15% |
| 26 | Michigan State | 61% | 66% | -5% |
| 27 | Maryland | 58% | 44% | 13% |
| 28 | Creighton | 58% | 43% | 15% |
| 29 | Seton Hall | 56% | 30% | 26% |
| 30 | Minnesota | 55% | 57% | -2% |
| 31 | Saint Mary’s | 55% | 57% | -2% |
| 32 | Vanderbilt | 51% | 53% | -2% |

In only five cases (Butler, Florida, Virginia, Wisconsin, and Seton Hall) did our rules shift your pick percentages by more than 15%, in each case away from the consensus favorite.

The overall pick bias was remarkably consistent, too:

Figure 2. Frequency of ReSolve First Round Picks by Seed

Don’t be confused by the bump in the 10 seed, either.  That was largely driven by Wichita State, a team that was an underdog by seed but a 70% favorite in Vegas.

Also, don’t think that this phenomenon was isolated to the first round.  Bias towards higher-seeded teams ran deep through most brackets, as evidenced by the distribution of our champion picks:

Figure 3. Frequency of ReSolve National Champion Picks

| Team | Seed | Frequency |
|---|---|---|
| North Carolina | 1 | 11 |
| Gonzaga | 1 | 10 |
| Kansas | 1 | 10 |
| Villanova | 1 | 8 |
| Oregon | 3 | 7 |
| Kentucky | 2 | 6 |
| Duke | 2 | 5 |
| Arizona | 2 | 4 |
| Michigan | 7 | 2 |
| Maryland | 6 | 1 |
| Notre Dame | 5 | 1 |
| Cincinnati | 6 | 1 |
| Creighton | 6 | 1 |
| Virginia Tech | 9 | 1 |
| Florida | 4 | 1 |
| Marquette | 10 | 1 |

87% of our entrants chose a 3 seed or better to win it all.  And sure, the plurality pick, 1 seed North Carolina, won the National Championship, but you have to go all the way down to our #7 finisher to find the highest-ranked bracket that actually predicted it.  That bracket – submitted by “Hinch” – finished 20 points off the leader, and even then, North Carolina accounted for a meager 12% of Hinch’s points.

Around here, chalk doesn’t pay.

Why Did People Still Pick Favorites?

There are four possible reasons that people were biased towards favorites:

  • An incomplete understanding of our scoring rules, which caused people to adhere to standard methods.
  • An assumption that the number of correct picks would ultimately correlate strongly with total points.
  • A failure to fully grasp the distribution of points that could be earned by a low seed scoring a huge upset or a mid-seed making a Cinderella run.
  • A genuine belief that, based on a perceived informational edge, favorites would earn outsized points this year.

Let’s dissect these one at a time.

Possibility #1: Lack of Understanding of Scoring Rules

On this point, there’s not much to say.  While we did our best to clearly explain the rules, we know that at least one person slightly misunderstood the scoring method.  Hopefully that was isolated, but with rules as unique as ours, it is likely that a few entrants didn’t fully understand how their brackets would be scored.

Possibility #2: False Assumption that Number of Correct Picks Would Correlate with Total Points

The following chart says it all:

Figure 4. Total Score and Number of Correct March Madness Picks, 2017

60 of our 70 entries correctly picked between 32 and 42 games.  Furthermore, once a bracket was in that range, an additional correct pick added little marginal value.

As one (admittedly cherry-picked) example, both our 5th place and 63rd place entries chose 32 games correctly.  Zooming the lens out, for our top 30 brackets by score, the correlation between the number of correct picks and total points was indistinguishable from zero.

This is exactly what we’d hope to see: the quality of the picks proved much more important than the quantity of winners.

Possibility #3: Failure to Grasp the Distribution of Points by Team

Awarding points based on the inverse cumulative odds of a team accomplishing something remarkable inevitably leads to a skewed distribution of total points.  This is no different from traditional rules, where getting the national champion correct is worth 32x as many points as getting a first round game right.  We have a similar skew, but in our tournament, we’re rewarding teams that outperform expectations.

Figure 5.  Points by Team, 2017

Because of this, our distribution of points by team skewed heavily towards South Carolina, Xavier, and Oregon, who combined accounted for 59% of the total points available.  Adding in North Carolina only bumps that to 64%, emphasizing the importance of underdogs and Cinderellas relative to the National Champion.  It should come as no surprise that the top of our leaderboard had significant exposure to South Carolina and Xavier specifically.
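To see why the skew gets this extreme, here’s a sketch with hypothetical cumulative odds (illustrative numbers, not the actual 2017 market figures): a longshot that keeps advancing compounds enormous payoffs, while a chalk run pays comparatively little.

```python
# Hypothetical cumulative odds of reaching each successive round --
# illustrative numbers only, not the actual 2017 market figures.
chalk_run = [0.95, 0.80, 0.60, 0.45, 0.30, 0.20]  # a 1 seed marching to the final
longshot_run = [0.60, 0.25, 0.08, 0.02]           # a Cinderella reaching the Final Four

def run_points(cumulative_odds):
    """Total points for a team that clears every listed round, with each
    round worth the inverse of the cumulative probability of reaching it."""
    return sum(1.0 / p for p in cumulative_odds)

print(run_points(chalk_run))     # ~14.5 points across six wins
print(run_points(longshot_run))  # ~68.2 points across only four wins
```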

Possibility #4: Genuine Belief that Top Seeds Would Outperform Expectations

This is the only rational justification for choosing so many favorites.  However, there’s little data supporting the notion that your collective bias was information-driven.

How Prospect Theory Ruined Your Bracket

Ultimately, we believe that the biggest reason for your chalky brackets was deeply-rooted cognitive biases.

According to the work of Daniel Kahneman and Amos Tversky on Prospect Theory – the behavioural economics framework for which Kahneman was later awarded the Nobel Prize in Economics – fear of loss is roughly 2.5 times more powerful than lust for gains.  Applied here: deviating from the consensus exposes you to the risk of failing alone, and that prospective loss looms far larger than the prospective glory of a contrarian win.  So, in the absence of a strong informational advantage, you tended to stick with the consensus picks and Vegas favorites.
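For reference, this asymmetry is typically modeled with the Tversky-Kahneman (1992) value function; a minimal sketch using their published parameter estimates:

```python
def prospect_value(x, alpha=0.88, lam=2.25):
    """Tversky & Kahneman's (1992) value function: diminishing sensitivity
    to both gains and losses, with losses scaled up by the loss-aversion
    coefficient lam (their estimate: ~2.25)."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

print(prospect_value(100))   # ~57.5: the felt value of a 100-unit gain
print(prospect_value(-100))  # ~-129.5: the same-sized loss feels ~2.25x worse
```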

In addition, we must consider the notions of absolute versus relative success and failure.  Specifically, while entrants in our bracket challenge may have been anxious about their absolute performance, they would have been far more comfortable with it so long as others followed similar strategies and performed comparably.  After all, failing together is more “comfortable” than failing alone.

But a consensus strategy also makes it very difficult to win.  The contrarian behaviors that may lead to relative failure are also required to achieve relative success.  And because prospect theory dictates that failing alone is the worst possible outcome one can experience, in the absence of an informational edge, most people prefer the safety of the herd.

How Prospect Theory Ruins Your Portfolio

If the connection to investing isn’t yet clear, consider the plight of thoughtful, globally-diversified investors over the past couple years.  Sure, these investors have made money, but not nearly as much as their friends and neighbors with portfolios concentrated in US stocks.

Making things worse, a potent cocktail of home bias and equity concentration makes US stock portfolios more the norm than the aberration.  As such, despite the fact that on an absolute basis everyone is winning, on a relative basis diversification has felt like a loser.

Times like these test the perseverance of investors who zig when everyone else is zagging.  But if the process is thoughtful and backed by solid data, then the short term results should be a far smaller consideration than the weight of the evidence.

This is true in your portfolio, and it’s true in March Madness.

How We’re Going to Use Prospect Theory in Our 2018 Bracket Challenge

People respond to incentives.  The problem this year was that our incentive to pry you away from the herd wasn’t strong enough.  So that’s what we’re going to change next year.  Though we reserve the right to change our rules between now and then, it makes sense to size the payoff for correctly choosing an underdog so that it looks equivalent not to the rational strategist, but to the irrational mind.

That is, what if, after awarding points in inverse proportion to the cumulative odds (as we did this year), we further multiplied the underdog’s points by a factor of 2 or 3?
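As a rough sketch of how that might score (the 2x multiplier and the 50% underdog threshold below are placeholders, not the final rule):

```python
def points_2018(p_advance, bonus=2.0):
    """Hypothetical 2018 scoring: inverse-odds points as in 2017, with an
    extra multiplier when the pick is a market underdog (implied probability
    below 50%).  Both the bonus and the threshold are placeholders."""
    base = 1.0 / p_advance
    return base * bonus if p_advance < 0.5 else base

print(points_2018(0.70))  # favorite: ~1.43 points, no bonus
print(points_2018(0.20))  # underdog: 5.00 doubles to 10.00 points
```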

We’ll spend our off-season thinking on it, but rest assured, we will never end our search for the incentive structure that meaningfully eliminates your pick biases.

And Lastly, the Final Results for ReSolve’s 2017 March Madness Bracket Challenge

Congratulations to colonel717, who scored almost exactly 50% of the total available points. Look upon this bracket, all ye losers, and imagine what could have been.

| Final Rank | Name | Points |
|---|---|---|
| 1 | colonel717 | 81.52 |
| 2 | 0761fabera | 77.75 |
| 3 | bshibe | 76.63 |
| 4 | DavefromBarrie | 69.22 |
| 5 | Purvinator | 65.71 |
| 6 | etfmike | 65.04 |
| 7 | Hinch | 61.78 |
| 8 | MrsBrick | 61.57 |
| 9 | mike.king | 61.30 |
| 10 | PatBolland | 59.81 |
| 11 | Mwkohler | 57.30 |
| 12 | geoff313 | 54.44 |
| 13 | jd1218 | 53.41 |
| 14 | acmeinvest | 51.25 |
| 15 | mackchiu | 50.53 |
| 16 | Sallese | 49.64 |
| 17 | DCSorley | 49.16 |
| 18 | Dragana | 48.95 |
| 19 | MCCRINDELL | 48.37 |
| 20 | archer | 47.89 |
| 21 | Oilerman44 | 47.67 |
| 22 | bviveiros | 47.42 |
| 23 | trentd | 47.16 |
| 24 | shawkins | 47.06 |
| 25 | baylasdad | 46.97 |
| 26 | les sherman | 46.51 |
| 27 | glaschowski | 46.48 |
| 28 | mlederman | 45.39 |
| 29 | snadkarni | 44.69 |
| 30 | eharari | 44.17 |
| 31 | MrDuMass | 44.10 |
| 32 | Nick1 | 43.96 |
| 33 | bigdawg24 | 42.92 |
| 34 | PunterJP | 42.78 |
| 35 | Fenders10 | 42.64 |
| 36 | jwood3010 | 42.63 |
| 37 | HungryHungryHippos | 42.43 |
| 38 | CBurnell4 | 42.04 |
| 39 | Pvolpe | 41.84 |
| 40 | CMRAM | 41.67 |
| 41 | cotts23 | 41.57 |
| 42 | teamcatherine | 41.28 |
| 43 | KPeg15 | 41.00 |
| 44 | ukbsktbll | 40.82 |
| 45 | csorley1 | 40.75 |
| 46 | robbiep | 40.73 |
| 47 | RCMAlts | 40.45 |
| 48 | Crawling Chaos | 40.23 |
| 49 | pkatevatis | 40.16 |
| 50 | TheHotDogVendor | 39.55 |
| 51 | Walkush | 39.19 |
| 52 | Mnoack | 38.60 |
| 53 | mySphere | 38.45 |
| 54 | mattzerker | 38.05 |
| 55 | jcrich | 37.32 |
| 56 | Prewbee | 37.05 |
| 57 | brianm317 | 36.71 |
| 58 | CANESPEED | 36.64 |
| 59 | sjsivak | 36.41 |
| 60 | Jgunter742 | 36.26 |
| 61 | brennantim | 35.29 |
| 62 | jmkeil | 34.89 |
| 63 | scott.pountney | 34.56 |
| 64 | cosggg | 28.99 |
| 65 | Benc100 | 28.30 |
| 66 | Cozzsmack | 28.17 |
| 67 | resolvetest | 27.95 |
| 68 | LDB | 27.78 |
| 69 | DanAdams | 19.80 |
| 70 | Dead-Brick | 19.80 |