Announcing the FightMetric Research Contest
[UPDATE 3/11/2011] The contest will close on Monday, and we'll have a winner on Tuesday. I urge you to check out PistonHyundai's Distribution of Win % Across Reach Differences and Terrence Chan's Low Hanging Fruit. And, of course, keep researching!
How much do you really know about MMA?
That's not an insult or a challenge to a trivia contest. The kind of knowledge we're referring to isn't about memorized facts, but the basic underpinnings of the sport. How much does anyone really understand about MMA?
Despite what we think we know, there are simple and fundamental questions about the sport that remain unanswered. And these are not even the deep and difficult questions that require complex mathematical formulas or specialized analysis. The questions we have in mind are ones that a group of smart MMA fans with access to the proper data should be able to answer with little trouble; they've just never been given the chance.
To this end, FightMetric has joined forces with Bloody Elbow to create a Research Challenge, bringing the right data to the right people and giving you a chance to contribute to the advancement of MMA. We'll pose a single question to the Bloody Elbow community and provide a simple Excel spreadsheet with the relevant data. From there, we're asking you to look at the data, do some analysis, and write up your conclusions in a FanPost (make sure to put the words "Research Challenge" somewhere in the title of the FanPost). The most compelling entry will be selected as the winner and promoted to the front page. Whether you choose to create a grand thesis or just make observations about a few interesting data points, you'll all be contributing to the discourse that helps further our understanding of our favorite sport.
The first question we'll be looking at:
Does reach matter in MMA?
The answer might seem obvious given recent fights like Torres-Banuelos, where the fighter with the overwhelming reach advantage was able to jab his opponent into oblivion. But that answer only makes sense in the context of matches spent at striking distance. In MMA, the disadvantaged fighter can get to the clinch or the ground and the reach advantage is negated, and potentially a liability. To find out if reach confers a meaningful advantage in all cases, some cases, or no cases, we'll have to look at the data.
The provided spreadsheet contains information for nearly 1500 fights from the UFC, WEC, Strikeforce, EliteXC, and Affliction. To reduce distraction when analyzing the data, we've removed all identifying information about the fights and fighters. The focus is simply on the winner's reach measurement, the loser's reach measurement, and the method of victory (draws and No Contests are not included).
Be thorough, be creative, even be contrarian if you like. There is no single correct answer here. It's not necessary to concoct the final words on these topics, just the first words.
Good luck and happy data hunting!
The FanPosts are solely the subjective opinions of Bloody Elbow readers and do not necessarily reflect the views of Bloody Elbow editors or staff.
Comments
This is a very cool idea!
Why I never joined a frat: http://www.youtube.com/watch?v=K-KNVrZaN8M
"Don’t quote old fucks to me" – Brent Brookhouse
"A samurai would bite your cock off if you tried that shit on the battlefield." - Kid Nate
Quick correction – unless I’ve got an incomplete spreadsheet, it “only” gives data for 1486 fights.
None more gangster.
Tweeter!
Ha, we've clearly all been duped again!
Download blocked from work, but excited to look over the data from home. Cool idea indeed!
"If at first you don't succeed, destroy all the evidence that you tried"
It says "almost 1500"
In other words, you’re dealing with the whole thing
Reach in MMA: How Bones Will Beat Shogun
I bet we see a few of those.
Awesome topic, will look into it more when I’m not mobile
by rkilla on Mar 8, 2011 1:12 PM EST via mobile
Cool idea!
Unfortunately I won’t have time to make a real thorough analysis :( But I’m confident others will, really looking forward to the outcome!
We’ll provide the paint, and the fence, and all you have to do is get to painting. We’ll select a couple of the best slats and feature them near the front of the house…
I keed. This is great.
Tatum: I think he's a good man. I like him. I got nothing against him, but I'm definitely gonna make orphans of his children.
"Oh come, now, you don’t mean to let on that you like it?"
"Like it? Well, I don’t see why I oughtn’t to like it. Does a boy get a chance to whitewash a fence every day?"
…
"Say, Tom, let me whitewash a little."
"Someone is WRONG on the internet. What do you want me to do? LEAVE? Then they'll keep being wrong!"
-Randall Munroe
by pdl on Mar 8, 2011 3:41 PM EST (1 rec)
I'll start with the low-hanging fruit
The easy stuff:
- The average winner has a reach advantage of .23755 inches. Or 0.33% of total length. The average reach of all fighters in the data is 72.778 inches. The longer fighter has a record of 688-641, or wins 51.8% of the time.
- When there is a difference of five inches or more, the longer fighter is 150-106 (58.6%).
- The average length of a fighter who wins by KO/TKO (referee or doctor stoppage) is 0.58" longer than that of his opponent. The longer fighter is 244-189 (56.4%) to win. Fighters who enjoy a 5" or more reach advantage are an impressive 59-27 (68.6%) winning by KO/TKO. We can draw a significant conclusion that being longer correlates to winning by knockout.
- The average fighter who wins by submission, however, is only 0.27" longer (btw at this point my use of the word “longer” without identifying the context is starting to make my inner 8-year-old snicker). Not only that, but the longer fighters are actually sub-.500 at 174-203 (46.2%). Fighters with a reach advantage of five inches or more, however, are 44-29 (60.3%) in the submission department, so if you’re a submission grappler, it does seem helpful to be much longer than your opponent, but not of much help to be just a little longer.
- Somewhat stranger, the shorter fighters are more likely to win a decision. The longer fighters are 231-265 (46.6%). Maybe the shorter fighters tend more to be top-position wrestlers who are grinding out more decision wins. Or it could be the other way around: fighters who know they are giving up reach realize they need to get inside and take the fight to the ground.
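For anyone who would rather script these tallies than build them in Excel, here is a minimal sketch of how the records above could be computed. It assumes each fight is a (winner reach, loser reach, method) row; the rows below are made-up toy values, not contest data, and the function names are hypothetical.

```python
# Toy rows only: (winner_reach_in, loser_reach_in, method). NOT the contest data.
fights = [
    (76.0, 70.0, "KO/TKO"),
    (72.0, 74.0, "Decision"),
    (70.0, 70.0, "Submission"),
    (78.0, 72.5, "Submission"),
    (69.0, 75.0, "Decision"),
]


def longer_fighter_record(rows, min_gap=0.0):
    """Return (wins, losses) for the fighter with the longer reach.

    Equal-reach fights are skipped, as are fights whose reach gap
    is smaller than min_gap inches.
    """
    wins = losses = 0
    for winner_reach, loser_reach, _method in rows:
        gap = winner_reach - loser_reach
        if gap == 0 or abs(gap) < min_gap:
            continue
        if gap > 0:
            wins += 1      # the longer fighter won
        else:
            losses += 1    # the shorter fighter won
    return wins, losses


def win_pct(wins, losses):
    """Winning percentage, or NaN when there are no decided fights."""
    total = wins + losses
    return 100.0 * wins / total if total else float("nan")


def mean_winner_advantage(rows):
    """Average of (winner reach - loser reach) over all rows, in inches."""
    return sum(w - l for w, l, _m in rows) / len(rows)


print(longer_fighter_record(fights))             # all gaps
print(longer_fighter_record(fights, min_gap=5))  # gaps of 5"+ only
print(round(win_pct(688, 641), 1))               # the 688-641 record quoted above
```

Filtering the rows by method before calling `longer_fighter_record` gives the per-method records (KO/TKO, submission, decision) in the same way.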
by Terrence Chan on Mar 8, 2011 1:20 PM EST (5 recs)
You have to throw out decisions, they are too subjective to be of any use statistically. TKO and Subs are things that fighters can control, hence what need to be looked at
You gotta pay the troll toll, to get into this boy's soul.
twitter.com/JayAreW
As erratic as the judges have been, I’m pretty sure through 496 fights you can wade through the static.
by Terrence Chan on Mar 8, 2011 1:35 PM EST
You're giving away the answers!!!
They made a video game about Yakuzas. It’s called Yakuza. And it’s about Yakuza
gocyborg.wordpress.com
You win
█♣█
A wise man told me don't argue with fools
Cause people from a distance can't tell who is who -- Jay-Z
Somewhat stranger, the shorter fighters are more likely to win a decision. The longer fighters are 231-265 (46.6%). Maybe the shorter fighters tend more to be top-position wrestlers who are grinding out more decision wins. Or it could be the other way around: fighters who know they are giving up reach realize they need to get inside and take the fight to the ground.
Or: Reach is correlated with height. Fighters with shorter reach are also shorter in height. I.e., lighter weight classes. Lighter weight classes are more likely to go to decision.
Nice breakdown.
That’s another consideration I had.
Mike, any chance of getting weight class info on this spreadsheet? It’s relevant in that bigger guys are more likely to knock each other out, and in the heavier weight classes you’re more likely to have a large reach advantage due to the bigger size range in heavyweight (see: Roy Nelson [74"] vs. Stefan Struve [83"] , Brock Lesnar [81"] vs. Cain Velasquez [77"], Shane Carwin [80"] vs. Neil Wain [72"], or Fedor [74"] vs. Bigfoot [82"] for examples).
None more gangster.
Tweeter!
Now this is intriguing.
It would have been nice to have dates on the fights though. I think it might be possible reach has more of an impact in more recent times.
"Who are you and how the hell did you get in here?"
"I'm a locksmith... and i'm a locksmith."
No, I don’t see how reach difference would change 5-10 years ago compared to now. I can see how a fighter with a reach disadvantage/advantage would improve over time but that’s just dealing with individual fighters.
I tend to be biased towards strikers . . . exciting strikers.
- - - - -
VEe is ANIMated!
by VeeisAnimated on Mar 8, 2011 2:59 PM EST
That’s what i mean. MMA is just a bunch of individual fighters, and as the general level of skill and technique increases across the board over time as it has, i’m thinking there is possibly a trend of these individual fighters learning to better implement their reach advantage.
"Who are you and how the hell did you get in here?"
"I'm a locksmith... and i'm a locksmith."
Conversely, wouldn’t shorter fighters learn how to make up for the deficit in reach?
Twitter: @Mike_Fagan_13
When you mentioned dates I’m thinking in terms of active fighters back 1999-2000 to the active fighters now.
I’m also thinking about a fighter like Fedor who apparently knew how to deal with a reach disadvantage early in his career.
I tend to be biased towards strikers . . . exciting strikers.
- - - - -
VEe is ANIMated!
by VeeisAnimated on Mar 8, 2011 3:18 PM EST
Submissions!
Let’s break down the sub data.
When you think about guys with long limbs, you think about triangles. And the data bear this out. The longer guy has triangled the shorter guy 28 times, while the shorter guy has only wrapped his legs around the longer guy’s neck 17 times (62.2%). The reach advantage for the typical winner by triangle is a full inch (1.05" to be precise).
At the other end, submissions that tend to be associated with shorter, thicker guys are the arm triangle and guillotine. The arm triangle holds true, with the shorter guys holding an 18-13 (58.1%) advantage. But the guillotine is still slightly the realm of the longer guy, by a margin of 44-37 (54.3%).
The good ol’ rear-naked is of course the most common submission in MMA, if not all of grappling, and here the long guys still hold the edge, 63-49 (56.3%).
Shorter guys have a perfect record of 5-0 when you look at wins by kneebar, which I think just tells you that Rousimar Palhares is shorter than everyone he’s fought.
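A per-submission breakdown like the one above can be sketched as a simple group-by. The method labels and rows here are assumptions made up for illustration (the real spreadsheet's labels may differ), and `record_by_method` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy rows only: (winner_reach_in, loser_reach_in, method). NOT the contest data.
# The label format is an assumption about how the spreadsheet names methods.
fights = [
    (76.0, 72.0, "Submission-Triangle"),
    (70.0, 73.0, "Submission-Triangle"),
    (71.0, 74.0, "Submission-Arm Triangle"),
    (75.0, 75.0, "Submission-Guillotine"),
    (77.0, 73.0, "Submission-Rear Naked Choke"),
]


def record_by_method(rows):
    """Summarize, per method, how often the longer fighter won.

    Equal-reach fights count toward the mean winner advantage but not
    toward the longer/shorter record.
    """
    tallies = defaultdict(lambda: {"longer": 0, "shorter": 0, "gaps": []})
    for winner_reach, loser_reach, method in rows:
        gap = winner_reach - loser_reach
        entry = tallies[method]
        entry["gaps"].append(gap)
        if gap > 0:
            entry["longer"] += 1
        elif gap < 0:
            entry["shorter"] += 1

    summary = {}
    for method, entry in tallies.items():
        decided = entry["longer"] + entry["shorter"]
        summary[method] = {
            "longer": entry["longer"],
            "shorter": entry["shorter"],
            "longer_pct": 100.0 * entry["longer"] / decided if decided else None,
            "mean_winner_advantage": sum(entry["gaps"]) / len(entry["gaps"]),
        }
    return summary


for method, stats in record_by_method(fights).items():
    print(method, stats)
```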
by Terrence Chan on Mar 8, 2011 1:32 PM EST (5 recs)
sorry...
Not a very frequent commenter; BE is usually a read-only interface for me, sorry about that. :)
by Terrence Chan on Mar 8, 2011 1:38 PM EST
No need to apologize. This is good stuff, and I just don’t want it to get lost here.
Twitter: @Mike_Fagan_13
Found some errors in my original post anyway, so might as well fix it up and make ’er look all pretty.
by Terrence Chan on Mar 8, 2011 2:08 PM EST
Rec’d for the Palhares line.
When a true genius appears in the world, you may know him by this sign, that the dunces are in a confederacy against him. - Jonathan Swift
Editor, HeadKickLegend.com
Contributor for CagesideSeats.com and Bloody Elbow Radio
Still Subo at Fightlinker.com
by Derek Suboticki on Mar 8, 2011 2:48 PM EST
Does the advantage begin to taper (or even flip) with excessive reach advantage?
I ask this because among the heavyweights you have some who are awkwardly tall, and at the lower weights beanpoles may be cutting excessively. This is completely anecdotal, but I feel that in my viewing experience excessively tall fighters don’t fare well, or as well as more reasonably tall fighters (i.e. 6-2 vs. 7-0, or 5-10 at bantamweight).
This should produce some cool stuff.
Great idea!
once again, a most concrete example
why bloody elbow is the class of the MMA news community.
Outstanding project!
will study and post.
"I hate these quotes: Rise And Shine or Rise And Grind....just rise and shut the fuck up" -Phil Baroni
by stainlesssteel on Mar 8, 2011 1:48 PM EST (1 rec)
You haven’t provided enough info. Asking if it matters in mma is such a broad question. Are we talking about mma on the elite level, on the average skill level throughout all competitive mma events, or theoretical, perfectly executed mma? We haven’t been told what level of competition the data was taken from either.
Updated the post to reflect that. Data comes from the UFC, WEC, Strikeforce, EliteXC, and Affliction.
Twitter: @Mike_Fagan_13
Wonder if cage vs. Ring affects reach advantage.
Ring always seems to favor strikers, while cage help ground and pound.
█♣█
A wise man told me don't argue with fools
Cause people from a distance can't tell who is who -- Jay-Z
How so? The fight begins in the center of the cage, if the outcome is determined quickly then the cage or ring doesn’t influence anything.
Cage can help ground and pound if a fighter is able to get the takedown. I’m thinking about Couture’s tactics to negate Vera’s reach and striking advantage by simply quickly engaging in the clinch and pinning him for minutes against the cage.
I tend to be biased towards strikers . . . exciting strikers.
- - - - -
VEe is ANIMated!
by VeeisAnimated on Mar 8, 2011 3:21 PM EST
There are several factors
Ring has corners which allow fighters to be penned in and evade less effectively, and cage provides additional opportunities for takedown because it’s less risky to throw someone against a cage than it is to throw someone against ropes (see Bas Rutten vs. Frank Shamrock for an example of how using takedowns in the ring is more risky).
84.5 reach
I’m not even gonna attempt because I suck at numbers/data, but damn man who has an 84.5 reach advantage? Struve is said to have 83 and Bones is what, 81.5 or something like that. I’ve seen this measurement pop up quite a few times and all on the winning side, who could it be? Alistair???
Bones has the 84.5 inch reach.
"Cancel my subscription to the resurrection, send my credentials to the house of detention" - Jim Morrison
by LRaunThaDamaja on Mar 8, 2011 2:15 PM EST
Ok thanks you guys
I guess my lazy ass could’ve checked Bones’ measurement to make sure. I saw a couple of wins next to the 84.5 reach by way of guillotine choke and kinda ruled Bones out seeing as how a handful of his subs were modified a bit (Jake O’Brien and Bader) so I wasn’t sure if he had any choke subs listed as official guillotines or not, again I could’ve easily found that out, but *eh fuck it.
What’s Overeem’s reach btw???
by SentientAndroid on Mar 8, 2011 2:52 PM EST
Once I import this into a database...
Ya’ll be fucked.
MatLab links to those warehouses
and lets you manipulate them with what is essentially c++ for dummies.
Great tool for non-programmers who need to program, definitely worth checking out.
I’m a big fan of analytics, but this just seems like someone coming up with a hypothesis (reach matters in MMA) and expecting someone else to do the grunt work to comb through the data. If you want to create a project like this, which I fully support in theory, you should make it more open ended.
I won’t be taking any credit for any of the work being done (besides my own, of course). I don’t want to speak for Rami, but I would wager he’s of the same mindset.
There’s much more at play here than answering a question about the effect of reach in MMA. If things work out like I hope, this is the beginning of cultivating a proto-sabermetric community in MMA. As I told Rami when he approached us with the project, this is something I’ve been dreaming about since I heard about FightMetric a few years ago.
Twitter: @Mike_Fagan_13
Would love to see this in the context
Of ring vs. cage. I’m at school so I can’t run the file, but is this data included? Maybe which orgs are included in these data?
concerning reach
How exactly is reach measured? Is it fingertip-to-fingertip or fist-to-fist? Neither of these should really be called reach; they should be called wingspan, since the measurement includes the chest, and some chests (like Bader’s) make up a good portion of their ‘reach’.
I’ve seen a couple ‘Tale of the Tapes’ from HDNET that show reach being from the shoulder to the fist, indicating true reach and the advantage.
Who's the only one here who knows illegal ninja moves from the government?
I believe they measure fingertip to fingertip
so yes, wingspan would be an accurate term (if you leave aside the fact that the fighters generally have arms, not wings :) Really the question is how far away can a fighter have his face from the other fighter and still hit him. So if one is turning into a jab in classic boxing style, some of the shoulder counts as ‘reach’. So from trapezius to end of closed fist?
I consider myself a softcore fan.
I think the best possible measurement
tip of knuckle to the acromioclavicular joint (easy to find, as opposed to the rotator cuff which would be the most precise measurement of arm length). A lot of guys with broad shoulders are at a disadvantage when it comes to straight punching.
That said, broad shoulders do give an edge in leverage for throwing hooks and overhand punches. That’s why a guy like Mike Tyson, who wasn’t particularly tall or long-armed (listed as 71 in reach according to Wikipedia) could be so devastating. He slipped jabs well and closed the distance to throw tight hooks where he had the edge in power. The same is true of Fedor (he was giving up 6.5 inches against Brett Rogers and he KO’d him just fine). Stefan Struve vs. Roy Nelson is another example.
So while it’s true that having a reach advantage can give someone the edge if both fighters are throwing straight punches, perhaps it’s just fighters not training the right way for countering the long jab. It can be extremely effective but if a shorter guy can slip the jab he should be able to fight someone who is just longer just fine. Look at how befuddled Nate Marquardt looked against Okami. He is used to being bigger than his opponent so he looked like he’d never slipped a jab in his life.
measuring wingspan seems a deceiving stat when discussing reach
Using the fingertip to fingertip method of measuring, Bones 84.5 might be 8 inches in fingers alone.
Who's the only one here who knows illegal ninja moves from the government?
Ohh.. I often look to bloodyelbow in those few hours I get a break from being a slave to excel. How dare you do this!
With that said, I will happily participate; statistics and MMA are very intriguing to me.
Awesome idea
Hey guys, I see some of the Excel gurus have already commented here with some excellent stuff. However, there are several comments about what info is relevant or not. Any good stats guy will tell you: give us all the data you can and we’ll work out whether it is relevant or not. I’d love to see regressions, etc. on this stuff. Primary fighting style is highly interesting, as are years of experience, weight class, number of previous wins, etc. Without that information, I don’t think we can get the full picture. That being said, here are some additional insights I think we can get:
METHOD:
A. I modified the data to simplify the “Method” into Primary Method (KO/TKO, submission, DQ, and Decision) as well as secondary method (Submission-Guillotine, etc.)
B. I did as other guys have, subtracting the difference between winner and loser
C. I pivot tabled the data.
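Steps A through C can also be done outside Excel. The following is only a sketch of one way to do it, assuming method labels shaped like "Submission-Guillotine"; the rows are made-up toy values and the helper names are hypothetical.

```python
from collections import defaultdict

# Toy rows only: (winner_reach_in, loser_reach_in, method). NOT the contest data.
fights = [
    (76.0, 70.0, "KO/TKO"),
    (72.0, 74.0, "Decision"),
    (78.0, 72.5, "Submission-Guillotine"),
    (69.0, 71.0, "Submission-Kneebar"),
    (73.0, 73.0, "Decision"),
]


def split_method(label):
    """Step A: split a label into (primary, secondary) on the first hyphen."""
    primary, _sep, secondary = label.partition("-")
    return primary.strip(), (secondary.strip() or None)


def pivot_by_primary(rows):
    """Steps B and C: per primary method, count fights and sum reach differences."""
    table = defaultdict(lambda: {"fights": 0, "total_diff": 0.0})
    for winner_reach, loser_reach, label in rows:
        primary, _secondary = split_method(label)
        table[primary]["fights"] += 1
        # Step B: winner's reach minus loser's reach
        table[primary]["total_diff"] += winner_reach - loser_reach
    return dict(table)


for primary, cell in pivot_by_primary(fights).items():
    print(primary, cell)
```

A positive `total_diff` means winners by that method were longer in aggregate; a negative one means they were shorter.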
RESULTS:
If you sum and subtract:
- Short arms win more decisions (and let’s be honest, there are a lot of wrestlers with short man syndrome. I bet if we had that stat, it would be a high correlation).
- Long arms win by KO/TKO and Submission more often as well, a difference of 287 and 117 respectively in total reach length. These numbers could be highly skewed by one or two fighters with multiple fights and massive reach advantages (ie, Silva, Jones, etc.). 10 fights from Anderson Silva could accomplish most of this negligible advantage.
- Submissions – Reach gives a minuscule advantage to RNC, Triangle and Guillotine. However, we have 9 fights that short arms win ankle locks, 5 by kneebar, and 21 by strike submissions. That stat tells me ground and pound which smells of wrestlers.
- Extreme difference – When there are extreme differences there also seems to be a correlation to wins. There are 344 wins from people with a reach advantage of 3+ inches, and only 274 from people with a reach disadvantage of 3+ inches.
by b_radical on Mar 9, 2011 9:05 PM EST (1 rec)
One general thought about the data
It would be good to cite cases where a particular tall or short fighter contributes heavily to the KO or sub count. Given that the tall guy advantage is significant, but not huge, it seems just as interesting to identify where particular fighters use their physical stature to an advantage. Then, once those fighters are identified in the statistics (and named), it would be interesting to review their fights, their discipline, and whatever other variables seem significant.
I say this because MMA is such a nascent sport that individual styles may be making as much difference as physical attributes. I mean, look at Frankie Edgar, whose smallness seems to be an advantage in many cases, but who also fights with a difficult to replicate style.
I started playing around with this, with the intent of performing some regressions. However, I ran into some issues. I’ll start off with some background.
To oversimplify a little, what we’ve seen so far are basically mean comparisons of winners’ and losers’ reach. Ok, this is a big oversimplification – several have provided some very intriguing and nuanced breakdowns. But ultimately, the structure of the data has encouraged all the posts I’ve seen to look at mean reach or mean reach differentials. This is useful, but it doesn’t quite map on, in a logical sense, to the process we think is occurring. I.e., what we think is happening is that reach advantage is influencing – i.e., causing a positive change in – the probability of winning. With mean comparisons, the implied causal direction is backwards. We are categorizing fighters into winners and losers and “predicting” reach differential. Regression frameworks map more closely on to the logic of the likely underlying causal model, and they’re also better suited for predicting the future. For example, ideally, we would take reach differential and use that to predict the probability of someone winning. Such a model would enable us, for example, to quantify the exact increase in the odds of winning that Jones’ reach advantage gives him over Rua. Note the outcomes in question are actually categorical (e.g., dichotomous win vs. loss; nominal method of winning). So this is a second sense in which particular categorical regression frameworks (logistic or multinomial) map more closely on to the likely underlying process. Having said all this, associations are associations, regardless of direction. My points thus far solely concern model conceptualization and aesthetics.
So we want to run a logistic regression model predicting fight outcome (win vs. loss) from reach differential. Let’s run through the early steps. The database provided has variables for winner’s reach and loser’s reach. We take the difference. Now, we need a variable corresponding to wins – e.g., coded 1 for wins and 0 for losses, to prepare for a logistic regression analysis. We can’t just assign a win any time a fighter in column A wins, because that column is defined as winner’s reach. I.e., they win 100% of the time, which means our outcome would not vary (zero variation implies zero covariation with other variables – i.e., regression models, which detect covariation, cannot be fitted). Likewise for loser’s reach – they win zero % of the time.
We could try taking another approach. E.g., we could restructure our data so that fighters with the longer reach were always in column A, and fighters with the shorter reach were always in column B. We could then 1) calculate the difference between these two variables as a third variable (which will always be positive), and 2) assign 1’s and 0’s to a fourth variable, corresponding to whether the fighter with the longer reach won. Now, we’ve got an outcome that shows some variation. But step back a moment and look more closely at step 2 above. What do we do when the fighters have equal absolute reaches? We have no basis upon which to assign an outcome.
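To make the restructuring concrete, here is a small sketch of the longer-reach-first recoding and of where the equal-reach fights fall out. The rows are made-up toy values, not contest data, and `longer_first` is a hypothetical helper name.

```python
# Toy rows only: (winner_reach_in, loser_reach_in). NOT the contest data.
rows = [
    (76.0, 70.0),  # longer fighter won
    (72.0, 74.0),  # shorter fighter won
    (73.0, 73.0),  # equal reach: no basis for coding the outcome
    (78.0, 72.5),  # longer fighter won
]


def longer_first(fights):
    """Recode fights as (reach_gap, longer_fighter_won).

    reach_gap is always positive. Equal-reach fights cannot be coded
    and are returned separately so they are not silently lost.
    """
    coded, ties = [], []
    for winner_reach, loser_reach in fights:
        gap = winner_reach - loser_reach
        if gap == 0:
            ties.append((winner_reach, loser_reach))
        else:
            coded.append((abs(gap), 1 if gap > 0 else 0))
    return coded, ties


coded, ties = longer_first(rows)
print(coded)  # rows usable as (predictor, outcome) pairs
print(ties)   # rows with no defined outcome under this coding
```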
Now, let me restate the penultimate sentence from an earlier paragraph above: “ideally, we would take reach differential and use that to predict the probability of someone winning.” With respect to the provided database and the issues discussed above, the question is: the probability of WHO winning? To calculate this variable, we have to have some basis on which to assign one person as “winner” and another person as “loser.” The only information in the provided database that provides this basis – this “anchor” – is reach differential. When there is no reach difference, there is no basis upon which to distinguish winners and losers. The fact that the “winner” appears in “Column A” is arbitrary and meaningless.
Alternative strategies are unsatisfactory. E.g., models could be fitted if we throw out cases with equal reaches. Throwing out that much relevant data would be kinda silly. Alternatively, we could assign random numbers to each case, sort, and use the resulting order to determine which column takes priority in the regression model. This is unsatisfactory on the grounds that we are unnecessarily introducing a source of random error into our model, one that will undoubtedly perturb results from one sample to the next. We’d like results that are as generalizable as possible to different samples.
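For illustration only, the random-assignment strategy might look like the sketch below: each fight is randomly oriented so that "fighter A" is either the winner or the loser, and a one-predictor logistic regression (with intercept) is fitted by plain gradient ascent. The rows are made-up toy values, the function names are hypothetical, and the hand-rolled fitting loop is a stand-in for whatever stats package you would actually use.

```python
import math
import random

# Toy rows only: (winner_reach_in, loser_reach_in). NOT the contest data.
rows = [
    (76.0, 70.0), (72.0, 74.0), (78.0, 72.0), (69.0, 75.0),
    (74.0, 71.0), (73.0, 73.0), (70.0, 72.0), (77.0, 73.0),
]


def randomly_orient(fights, seed):
    """Randomly pick which fighter is 'A'; return (reach_A - reach_B, A_won) pairs."""
    rng = random.Random(seed)
    oriented = []
    for winner_reach, loser_reach in fights:
        if rng.random() < 0.5:
            oriented.append((winner_reach - loser_reach, 1))  # A is the winner
        else:
            oriented.append((loser_reach - winner_reach, 0))  # A is the loser
    return oriented


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(data, steps=5000, rate=0.01):
    """Fit P(A wins) = sigmoid(b0 + b1 * diff) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(data)
    for _ in range(steps):
        g0 = g1 = 0.0
        for diff, won in data:
            residual = won - sigmoid(b0 + b1 * diff)
            g0 += residual
            g1 += residual * diff
        b0 += rate * g0 / n
        b1 += rate * g1 / n
    return b0, b1


# Each seed gives its own random orientation, so the fitted values
# can change from one run to the next.
for seed in (1, 2):
    print(seed, fit_logistic(randomly_orient(rows, seed)))
```

Note that equal-reach fights are kept here, which is the one thing this approach buys; the price is that the estimates depend on the random orientation.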
In sum, the core issue with the structure of the data is that there isn’t enough information to estimate regression models in a principled way. Additional variables – that show some non-trivial amount of variability – would enable regression models to be fit for all cases, because it would establish a baseline, objective win probability. To take a somewhat absurd example to drive home the point: even if we had something completely unrelated – like color of shorts, this would provide an objective basis for assigning wins or losses to all cases. We can then say, ok, people with darker colored shorts win 50% of the time. How much does reach difference influence this baseline win probability?
My recommendation is that in the next round, include some additional variables. Incidentally, the more the better: As noted by others, more variables will also enable more sophisticated models or crosstabulations that tease apart many potential confounding effects, such as weight class.
For a discussion of a closely related issue, see:
http://www.uvm.edu/~dhowell/StatPages/More_Stuff/icc/icc.html
This author’s example concerns cases in which there is no sensible ordering between a predictor and an outcome. This is not the exact issue in our example, so unfortunately his solution (use of something called the ICC) does not apply to our case (note also that he discusses but ultimately rejects a resampling procedure conceptually similar to one of the unsatisfactory approaches I discussed above). The best solution for our case is to include more variables.
technically,
I’m making a suggestion for the next database that’s sent out, not adding any new analyses. It’s also a suggestion others have made. I just provided another reason. So I didn’t think it was quite fanpost-worthy.
