Two More Brackets: Strong Predictors vs. Strong Predictors and My Instinct

Who knew analytics could be so much fun? I write about them for a living, but applying them to sports has made them come alive for me. As a result, I took the advice of the Watson Analytics March Madness Prediction worksheet I created and decided to base another bracket on what the worksheet says is a stronger predictor based on Pomeroy’s data: the combination of tempo rank and Pythagorean rank.

[Image: Watson Analytics prediction results (WAP2)]

Watson Analytics offered even stronger combinations, but they all came down to losses driving wins, so I had to discard them. Anyway, a predictive strength of 71% was good enough for me.

The result of applying this combination to the bracket was a lot more interesting than using Pythagorean rank alone. I now have a better understanding of why there’s a trend of a 5 seed losing to a 12 seed. My problem is that I haven’t been able to figure out which 12 seed. However, according to my prediction worksheet, Wofford’s combination of tempo rank and Pythagorean rank was more likely to lead to a win than Arkansas’s combination, so that’s where that upset could come.

I did have to make some judgment calls. Based on the statistics, Kentucky is one of four teams that have the combination most likely to lead to wins. The other three are Villanova, Gonzaga, and Notre Dame. This means that Notre Dame and Kentucky must meet in the Final Four. I went ahead and picked Kentucky to win because Kentucky is undefeated and because of John Calipari’s NCAA tournament resume.

I’m also sad to report that Harvard’s combination of tempo rank and Pythagorean rank was better than UNC’s, so apparently my karma for enjoying Duke’s early exits has come due.

There were other surprises, such as UCI’s much better combination than Louisville’s. Also, the analysts who have been saying that Utah is most likely to challenge Duke are correct; Utah’s combo is just a tad less winning than Duke’s. Interestingly, Duke would be no match for Gonzaga if they meet. I’m hoping that because this is a statistical bracket, Gonzaga will make it far enough to play Duke, because usually when I have them going far, they lose early, and when I have them losing early, they go far. This. Happens. Every. Single. Year. Enough whining. Here’s the bracket.

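Midwest

[Image: Midwest bracket]

West

[Image: West bracket]

South

[Image: South bracket]

East

[Image: East bracket]

Final Four and Champion

[Image: Final Four and Champion bracket]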

And finally, I decided I would also take the Watson Analytics information and combine it with some other factors such as the team’s coach’s tournament resume and my personal bias. I’m exhausted from all this cutting and pasting, so I’m just going to show my Final Four and Champion from that bracket.

[Image: Final Four and Champion from the third bracket (FF3)]

If it turns out this bracket is more of a winner than the others, I’ll post it at the end of the tournament.

I would be remiss if I didn’t mention that my friend Mark Buerger has suggestions for people who would like human help with their brackets. (Hi, buerg!)

Of course, if the Tar Heels would like to defy all statistics and win it all in memory of Dean, I would be content.

How I Decided to Take a Different Tack for my Brackets This Year

In 1997, I won my company’s March Madness pool, and that’s the last time that one of my brackets did not turn into a sea of red by the end of the first weekend. I blame Gonzaga. Ever since they started getting into the tournament, they have never done what I predict, and the downward spiral begins. So, this year I decided to avoid going with my gut or listening to analysts, since the results have been disastrous. It is time for drastic action; it’s time to be more scientific.

My decision to try a new process coincides with my work with IBM’s cloud analytics service that anyone can use for free (with limitations) just by registering at www.watsonanalytics.com. You can upload a spreadsheet of data and get information about it with the service, which is called Watson Analytics. I’m a subscriber to Ken Pomeroy’s data (http://kenpom.com/), so I decided to use his data combined with win-loss records and upload it into Watson Analytics to help me fill out my NCAA bracket.

I uploaded the data and decided to use the predict function to see what most influenced wins. I kept the analysis very simple: I looked for the single factor in Pomeroy’s data that was most likely to affect wins. After discarding the information that losses were a strong predictor of wins, I found that Pomeroy’s PythagRank was the next strongest predictor of wins.
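For readers who want to try a similar single-factor comparison outside Watson Analytics, here is a minimal sketch in Python. It is not how Watson Analytics computes predictive strength; it swaps in the absolute Pearson correlation with wins as a simple stand-in. The function names and the numbers in `teams` are made up for illustration and are not Pomeroy’s data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_predictors(rows, target, exclude=()):
    """Sort candidate columns by |correlation| with the target, strongest first."""
    candidates = [c for c in rows[0] if c != target and c not in exclude]
    target_values = [row[target] for row in rows]
    scores = {
        c: abs(pearson([row[c] for row in rows], target_values))
        for c in candidates
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rows: invented numbers, not real team statistics.
teams = [
    {"Wins": 30, "Losses": 4, "RankPythag": 3, "RankTempo": 150},
    {"Wins": 26, "Losses": 8, "RankPythag": 20, "RankTempo": 40},
    {"Wins": 22, "Losses": 12, "RankPythag": 45, "RankTempo": 300},
    {"Wins": 18, "Losses": 15, "RankPythag": 90, "RankTempo": 10},
    {"Wins": 15, "Losses": 17, "RankPythag": 120, "RankTempo": 200},
]

# Losses are excluded because they trivially mirror wins.
ranking = rank_predictors(teams, target="Wins", exclude=("Losses",))
for name, strength in ranking:
    print(f"{name}: {strength:.2f}")
```

Excluding `Losses` mirrors the step of discarding it as a predictor, since a team’s losses are almost the same information as its wins.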

[Image: Watson Analytics results for Wins]

So, I filled out my bracket based on Pomeroy’s Pythagorean Ranking. It’s not full of huge surprises, but it’s interesting and it has teams going far in the tournament that I would not have picked on my own. So maybe I’ll have more success with this one. I’m also considering creating another bracket based on a combination of factors such as RankPythag and RankTempo, which Watson Analytics tells me is an even stronger predictor of wins.
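Filling out a bracket by rank alone is mechanical enough to sketch in a few lines of Python. This is only an illustration of the idea, assuming a lower rank number means a better team and that the field is listed so adjacent teams play each other. The team names, ranks, and function names are placeholders, not real tournament data.

```python
def pick_winner(team_a, team_b, rank):
    """Pick the team with the better (numerically lower) rank."""
    return team_a if rank[team_a] <= rank[team_b] else team_b


def play_bracket(field, rank):
    """Advance winners round by round.

    `field` is ordered so that adjacent entries meet in the first round,
    and its length is assumed to be a power of two.
    Returns a list of rounds, each a list of that round's winners.
    """
    rounds = []
    current = list(field)
    while len(current) > 1:
        current = [
            pick_winner(current[i], current[i + 1], rank)
            for i in range(0, len(current), 2)
        ]
        rounds.append(current)
    return rounds


# Hypothetical four-team field with invented ranks.
rank = {"Team A": 1, "Team B": 60, "Team C": 25, "Team D": 18}
field = ["Team A", "Team B", "Team C", "Team D"]

rounds = play_bracket(field, rank)
for number, winners in enumerate(rounds, start=1):
    print(f"Round {number} winners: {winners}")
```

A bracket built this way never picks an upset on its own, which is why the better-ranked team always advances and the results hold few surprises.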

For now, I’m going to post screenshots of my bracket, where the winner of each game was picked based on Pomeroy’s Pythagorean rank, and I’ll return after the First Four and the next round of games to see how this bracket is doing.

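Midwest

[Image: Midwest bracket]

East

[Image: East bracket]

West

[Image: West bracket]

South

[Image: South bracket]

Final Four and Champions

[Image: Final Four and Champions bracket]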

I would be remiss if I didn’t mention that my friend Mark Buerger has suggestions for people who would like human help with their brackets. (Hi, buerg!) After all, the human element is rarely more evident than during March Madness.