Microsoft Office Forums Queries to Return Highest WAR and Player's Name(s)?
 Register FAQ Search Today's Posts Mark Forums Read

#1
02-20-2020, 07:23 AM
 chasfh Windows 7 64bit Office 2016 Novice Join Date: Apr 2018 Posts: 9
Queries to Return Highest WAR and Player's Name(s)?

I am trying to devise a formula that scans a table and returns the batter with the highest WAR within a particular age range. For example, of all the batters who played from ages 22 to 27, which of them has the highest cumulative WAR for that range?

I have an Excel file almost 51,000 rows deep that lists more than 9,000 major league batters, who played at ages ranging from 18 to 42. There are hundreds of age ranges within, starting at 18-18, then 18-19, 18-20, etc., up to 18-42; then, 19-19, 19-20, 19-21, etc., up to 19-42; and so on, until we end up at 41-41, 41-42, and finally 42-42.

Each batter's play generates a number of WAR, as you can see on the sample I upload (please focus only on columns B, C, and D; the rest don't matter for my question). WAR is an acronym for Wins Above Replacement. That's not germane to what I'm trying to accomplish but it's context for you. I would need the exact same formula if I had been working with home runs instead of WAR (which I may, once I get this).

What I am try to determine through an elegant Excel formula is, which batter on the table generates the highest cumulative WAR total for each age range on the table. Not the highest WAR for an individual year within the range, but the highest multi-age total by any batter within the entire range.

As an example, let's consider the age range I offered in the original post: 22-27. I want an Excel formula that will scan the entire table and identify every batter who played the game anywhere within that age range (which is most of them). So anyone who played all six of those ages would qualify, as well as any batter who played in any if not all of those ages.

I happen to know there have been 7,828 batters who played during any part or all of the ages 22 to 27. This Excel formula would scan for all the batters in my table who fit that criterion, and then return that batter with the highest cumulative WAR total within: in this case, if the formula works right, it would return 54.9 as that highest number, and hopefully, another formula (using =FILTER, perhaps?) that returns Mickey Mantle as the author of that performance. (You won't get that on my sample table because I truncated it to upload the sample.)

I want to be able to apply the Excel formula to each of the 325 age ranges I am working with in part because I would like to learn how to apply this kind of formula to other queries I might be interested in doing; but, also, to avoid having to do 325 separate queries on this one website I can do age-range queries on.

I hope what I am looking for is clear, but if not let me know how I can help further clarify. I appreciate your help. Thank you!
Attached Files
 Best WAR by Age Range_sample.xlsx (132.0 KB, 4 views)
#2
02-24-2020, 10:08 AM
 chasfh Windows 7 64bit Office 2016 Novice Join Date: Apr 2018 Posts: 9

Hmm. Nothing?

Huh.
#3
02-27-2020, 02:15 AM
 ArviLaanemets Windows 8 Office 2016 Expert Join Date: May 2017 Posts: 551

In attached workbook, on sheet DataTable you enter your data. There are helper columns with differently colored headers - you can hide them;

The sheet Hidden contains a helper table, and as the name of sheet says, you can hide this sheet too;

On sheet Rep, you can select age range you want to get report for (which age is lesser or greater is not limited). Depending on selected age range, the max WAR for this age range and list of players who got max WAR in this age range + season of this result are displayed. I designed the report to be able to return up to 10 players listed - you are free to add or remove rows in player list table at will.
Attached Files
 BestWAR.xlsx (449.5 KB, 1 views)
#4
02-27-2020, 09:48 AM
 chasfh Windows 7 64bit Office 2016 Novice Join Date: Apr 2018 Posts: 9

Quote:
 Originally Posted by ArviLaanemets In attached workbook, on sheet DataTable you enter your data. There are helper columns with differently colored headers - you can hide them; The sheet Hidden contains a helper table, and as the name of sheet says, you can hide this sheet too; On sheet Rep, you can select age range you want to get report for (which age is lesser or greater is not limited). Depending on selected age range, the max WAR for this age range and list of players who got max WAR in this age range + season of this result are displayed. I designed the report to be able to return up to 10 players listed - you are free to add or remove rows in player list table at will.

Thanks for your effort in putting this together. It's not what I am looking for.

Your solution returns the player with the highest single season WAR within that age range. I am looking for a solution that returns the player with the highest cumulative WAR across the several seasons of that age range.

See my example above: if I were to select the age range 22 to 27, your formula would return Babe Ruth at 13.9 (at age 26), since he was the player within that age range with the highest ever single season WAR. I am looking for the solution that returns the player with the highest multiple-season cumulative WAR for that age range, which would in this case be Mickey Mantle with 54.8 (i.e., adding 6.7 at age 22, 9.8 at age 23, 11.5 at age 24, 11.4 at age 25, 8.8 at age 26, and 6.6 at age 27).

Any idea whether I can get to that solution by tweaking your work?
#5
02-28-2020, 12:39 AM
 ArviLaanemets Windows 8 Office 2016 Expert Join Date: May 2017 Posts: 551

I think Rep2 does what you want.

Btw, Mickey Mantle example data didn't contain age 27. So Ty Cobb will have max total WAR for age range 22-27 instead.

The season in report is earliest one for this player in age range. You can modify formula for column R2NameWarSumRow to get player seasons reversed (...[Season];">=" & [@Season]...) - then latest season in age range will be returned instead.

To get all players with max total score and all their seasons in selected age range listed in report, additional helper column(s) is/are needed (you need to number all rows for all players with max total WAR in some order). When you want this, than the example must have given enough clues for you!
Attached Files
 BestWAR.zip (563.6 KB, 1 views)
#6
03-09-2020, 03:53 PM
 chasfh Windows 7 64bit Office 2016 Novice Join Date: Apr 2018 Posts: 9

Quote:
 Originally Posted by ArviLaanemets I think Rep2 does what you want. Btw, Mickey Mantle example data didn't contain age 27. So Ty Cobb will have max total WAR for age range 22-27 instead. The season in report is earliest one for this player in age range. You can modify formula for column R2NameWarSumRow to get player seasons reversed (...[Season];">=" & [@Season]...) - then latest season in age range will be returned instead. To get all players with max total score and all their seasons in selected age range listed in report, additional helper column(s) is/are needed (you need to number all rows for all players with max total WAR in some order). When you want this, than the example must have given enough clues for you!
I'm sorry it's taken me so long to get back to you. Your solution got me significantly closer, but it's too unwieldy to be practical.

Your solution actually does work, as long as I change the formula in B4 on your Rep2 tab from SUMIFS to MAXIFS. That's because every player season on the DataTable tab sums up the total WAR that player has and places it in each season, so using SUMIFS gets you as many multiples as seasons you are querying. That fix was simple.

The problem occurs when I expand the DataTable from the 4,000 player seasons I gave you in the example to include the full universe of player seasons, some 83,000+. It balloons the calculations such that if I simply change dates and recalculate, it takes well over a minute to calculate the result. Even with 32GB of RAM and a CORE i7 9th Gen processor, it must be going through zillions of calculations just to get the answer.

I really appreciate the work you put into this to help me get the answer. Thank you.
#7
03-10-2020, 12:28 AM
 ArviLaanemets Windows 8 Office 2016 Expert Join Date: May 2017 Posts: 551

In Excel:
1. More complex formula in cell means more time to calculate the result for this cell. This time also depends on type of (sub)formulas used.;
2. Total time needed for calculations in table is sum of calculation times for every formula in table.

This taken into account, there is no way for fast calculations with your current setup and nearly 100000 records in table.

I'd consider creating a SQL Server database , and a Job there, which:
1. E.g. every night reads info from your table and refreshes a copy of it in SQL Server database;
2. After that, the Job runs a stored procedure in SQL Server database, which calculates all needed statistics and stores them in separate table in SQL Server database.

Now, in your Excel table, you create a query which reads the statistics table from SQL Server database, and returns the result as table in another sheet. So long as you are content with results at 'Yesterday', you get them whenever you open the workbook.

When you don't have full SQL Server, you can get SQL Server Express for free. The caveat is, no Jobs in Express - you have to use Windows Task Scheduler to run task which runs stored procedures in SQL Server database to read data, to do calculations, and to update statistics table.

 Thread Tools Display Modes Linear Mode

 Similar Threads Thread Thread Starter Forum Replies Last Post Daav Word 2 08-08-2019 02:44 PM ballpoint Excel 1 02-23-2018 11:18 AM am_gup123 Excel 2 03-23-2017 01:56 AM frankiefrankiefrankie Excel 2 09-27-2016 08:02 AM ibrahimaa Excel 3 04-23-2012 07:52 PM

All times are GMT -7. The time now is 01:58 PM.

 -- Default Style -- Lightweight -- New Mobile Contact Us - Privacy Statement - Top