FanPost

Research Challenge: Eyeballing it

I am not a statistician. I am someone who finds scatterplots informative.

All I've done here is take the FightMetric data and compile it by reach advantage, which I'm defining as the difference in reach between the winner and the loser.  Naturally, this results in a negative reach advantage in cases where the longer fighter loses.

Within any given reach advantage, the number of winning results is compared to the number of total results to produce a Winning Ratio.  This is often called the winning percentage, but it's not a percentage, and I'm pedantic like that.

Then I dropped these two values into a scatterplot, plotting Reach Advantage against Winning Ratio.
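For anyone who wants to reproduce the tally, here is a minimal sketch of one reading of the steps above. The actual spreadsheet steps aren't shown in this post, so the counting rule (each fight adds a win at the winner's reach difference and a loss at the mirrored value), the function name, and the sample fights are all assumptions for illustration, not the FightMetric data.

```python
from collections import Counter

# Hypothetical sample: (winner_reach, loser_reach) in inches, one tuple per fight.
fights = [(74, 72), (70, 72), (76, 74), (72, 72), (75, 70), (71, 73), (73, 71)]

def winning_ratio_by_reach_advantage(fights):
    """Map each reach advantage to wins / total results at that advantage."""
    wins = Counter()
    losses = Counter()
    for winner_reach, loser_reach in fights:
        diff = winner_reach - loser_reach
        wins[diff] += 1      # the winner fought at a reach advantage of diff
        losses[-diff] += 1   # the loser fought at the mirrored value
    ratios = {}
    for adv in sorted(set(wins) | set(losses)):
        total = wins[adv] + losses[adv]
        ratios[adv] = wins[adv] / total
    return ratios

ratios = winning_ratio_by_reach_advantage(fights)
for adv, ratio in ratios.items():
    print(f"{adv:+d} in: {ratio:.2f}")
```

Counted this way, a Reach Advantage of 0 always comes out at exactly 0.5, because every even-reach fight contributes one win and one loss to the same bucket. Draws and no contests are ignored in this sketch.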

[Chart 1: Reach Advantage vs. Winning Ratio, all fights]

You can see a clear positive correlation there, but if you ignore the trendline you'll notice the distribution gets a little flat closer to a Reach Advantage of zero. There are many more fights contested with smaller Reach Advantages (advantages over 10" are very rare), so I think what we're seeing is the increased signal at smaller (closer to zero) Reach Advantage values drowning out the noise we see at less common Reach Advantage values.

Let's see if we can refine that by dropping all the Reach Advantage values represented in only one fight.
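The filtering step can be sketched as below. The tallies are made up for illustration, and the function name and the (wins, total results) layout are assumptions rather than how the original spreadsheet was organized.

```python
# Hypothetical tallies: reach advantage (inches) -> (wins, total results).
tallies = {
    -11: (0, 1),
    -3: (4, 10),
    0: (6, 12),
    3: (6, 10),
    11: (1, 1),
    12: (2, 2),
}

def drop_sparse(tallies, min_results):
    """Keep only reach advantages backed by at least min_results results."""
    return {
        adv: (wins, total)
        for adv, (wins, total) in tallies.items()
        if total >= min_results
    }

at_least_two = drop_sparse(tallies, 2)
at_least_three = drop_sparse(tallies, 3)
print(sorted(at_least_two))
print(sorted(at_least_three))
```

Raising `min_results` trims the rare, extreme Reach Advantage values first, since those are the ones with the fewest fights behind them.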

[Chart 2: Reach Advantage vs. Winning Ratio, 2-fight minimum]

That's a flatter line. We've only removed seven Win events from the data, but that's a visibly flatter line. I'm sure someone can calculate (or already has calculated) the r-squared values, but even without them we can see that this is a flatter line.
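For anyone who would rather put numbers on "flatter" than eyeball it, here is a minimal sketch of an ordinary least-squares trendline with its slope and r-squared. It assumes the chart trendlines are simple linear fits, which the post does not confirm, and the points are invented for illustration, not read off the charts.

```python
def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared).

    Assumes at least two points with some variation in both x and y,
    otherwise the divisions below fail.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical (reach advantage, winning ratio) points.
points = [(-4, 0.35), (-2, 0.48), (0, 0.50), (2, 0.52), (4, 0.65)]
slope, intercept, r_squared = fit_line(points)
print(f"slope={slope:.4f}, intercept={intercept:.2f}, r^2={r_squared:.3f}")
```

Running the same fit on each filtered dataset and comparing the slopes would show whether a line really is flatter or steeper, regardless of how the chart axes are scaled.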

But there is still clearly some noise in the signal, as we have a few perfect records still scattered through the dataset.  Let's try again, raising the minimum from two fights to three.

[Chart 3: Reach Advantage vs. Winning Ratio, 3-fight minimum]

Now we've removed 12 more fights from the dataset, and the line got steeper again (I foolishly changed the scale of this chart, but if you extrapolate the line out it is clearly steeper).  So some of that statistical noise was actually hiding a stronger correlation.

Can we show a causal relation?  No.  But these charts do offer some justification for the difference in reach being called an "advantage".

--

I noticed that a Reach Advantage of 0 has a perfect record in all of these charts, but that's obviously not true. If you remove that datapoint (as I had intended before I made these charts), it has a small effect on charts 2 and 3, making the trendline very slightly steeper in both cases.

The FanPosts are solely the subjective opinions of Bloody Elbow readers and do not necessarily reflect the views of Bloody Elbow editors or staff.
