The original GitHub repository is at https://github.com/h-tu/cs320final
It contains all the data, the Rmd file, and this HTML page.
Basketball was invented by Canadian physical education instructor James Naismith in 1891. Over time the rules kept changing and the sport's popularity grew. Today, basketball is one of the most popular sports in the world, and the NBA represents its highest level. For more information about the NBA, see https://en.wikipedia.org/wiki/National_Basketball_Association. The league's history includes many of the greatest players of all time, such as Bill Russell, Wilt Chamberlain, Magic Johnson, Larry Bird, Michael Jordan, Hakeem Olajuwon, Shaquille O’Neal, Allen Iverson, Kobe Bryant, and LeBron James. Today, however, the NBA is changing and focusing more on three-point shooting.
In the last six seasons, the Golden State Warriors won three championships and reached five Finals. They can fairly be called the most dominant team in the NBA over that span. A big reason for their rise is the “deadly” three-point shooting of the “Splash Brothers,” Stephen Curry and Klay Thompson. But if you had been watching the NBA in 2000, you would not have believed that three-point shooting would become this important. At that time, the NBA was dominated by great centers like Shaquille O’Neal.
The offensive style of today’s NBA has changed a lot. Back in 1999, the Spurs averaged 88.6 possessions per 48 minutes, according to Basketball-Reference.com. In 2017, the Golden State Warriors averaged 102.24 possessions per 48 minutes. Both teams won the title in those respective years. A faster pace means more points are scored across the league, and the 3-point shot has a lot to do with that.
Gregg Popovich, one of the greatest coaches of all time, said: “Everything is about understanding it’s about the rules of the league and what you have to do to win. And these days what’s changed it is that everybody can shoot threes.”
As noted in the introduction, the NBA has changed a lot on both offense and defense: teams play faster and shoot more threes. The NBA can be said to have entered the era of three-point shooting. Our team is interested in how this change shows up in the data.
To investigate, we first tried to scrape data from the official NBA website, but the site appears to block unauthorized users from collecting its data. We then searched for the best alternative source of NBA data. After some comparison, we decided to scrape the data from https://www.basketball-reference.com/leagues/NBA_2020.html#all_team-stats-base, using the Miscellaneous Stats table. We will analyze the relationship between winning percentage and attributes such as three-point attempt rate. We also want to see how categories such as pace differ across the 2000-2019 seasons. Finally, we will use data science and machine learning to predict NBA games.
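As a rough illustration of the kind of relationship described above, the following minimal sketch computes winning percentage from wins and losses and then a Pearson correlation with three-point attempt rate. It is written in Python for illustration and is not the code from the project's Rmd file. The rows are made-up placeholder values, not real team statistics, and the `pearson` helper is a hypothetical name introduced here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Placeholder rows for illustration only, NOT real team statistics.
# Keys follow the glossary abbreviations (W, L, X3PAr).
rows = [
    {"W": 58, "L": 24, "X3PAr": 0.41},
    {"W": 49, "L": 33, "X3PAr": 0.37},
    {"W": 41, "L": 41, "X3PAr": 0.35},
    {"W": 33, "L": 49, "X3PAr": 0.36},
    {"W": 22, "L": 60, "X3PAr": 0.30},
]

# Winning percentage (WP) is wins divided by games played.
wp = [r["W"] / (r["W"] + r["L"]) for r in rows]
rate = [r["X3PAr"] for r in rows]

r_value = pearson(rate, wp)
print(f"correlation between X3PAr and WP: {r_value:.3f}")
```

A value near 1 or -1 would suggest a strong linear relationship and a value near 0 a weak one. With real data, the same calculation could be repeated per season to see whether the relationship changes over time.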
For more information about the techniques we used, see https://www.insidescience.org/news/artificial-intelligence-nba-basketball and https://en.wikipedia.org/wiki/Logistic_regression.
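To make the idea of logistic regression concrete, here is a minimal sketch that fits a one-feature model with plain gradient descent. It is an illustration in Python under stated assumptions, not the model used in this project: the feature (a rating difference between two teams), the training values, and the function names `sigmoid`, `fit_logistic`, and `predict` are all hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(win) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Predicted probability of a win for feature value x."""
    return sigmoid(w * x + b)

# Made-up training data for illustration only, NOT real game results:
# x = hypothetical rating difference, y = 1 for a win and 0 for a loss.
xs = [-8.0, -5.0, -3.0, -1.0, 1.0, 2.0, 4.0, 7.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)
print(f"p(win | x = +5) = {predict(w, b, 5.0):.2f}")
print(f"p(win | x = -5) = {predict(w, b, -5.0):.2f}")
```

The model outputs a probability rather than a hard label, which is why logistic regression suits win/loss prediction. In practice a library implementation with several features would be the usual choice.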
Since every column has an abbreviated name, we provide a glossary.
Age – Player’s age on February 1 of the season
W – Wins
L – Losses
PW – Pythagorean wins, i.e., expected wins based on points scored and allowed
PL – Pythagorean losses, i.e., expected losses based on points scored and allowed
MOV – Margin of Victory
SOS – Strength of Schedule; a rating of strength of schedule. The rating is denominated in points above/below average, where zero is average.
SRS – Simple Rating System; a team rating that takes into account average point differential and strength of schedule. The rating is denominated in points above/below average, where zero is average.
ORtg – Offensive Rating. An estimate of points produced (players) or scored (teams) per 100 possessions
DRtg – Defensive Rating. An estimate of points allowed per 100 possessions
NRtg – Net Rating; an estimate of point differential per 100 possessions.
Pace – Pace Factor: An estimate of possessions per 48 minutes
FTr – Free Throw Attempt Rate. Number of FT Attempts Per FG Attempt
X3PAr or 3PFGAR – 3-Point Attempt Rate. Percentage of FG Attempts from 3-Point Range
TS – True Shooting Percentage. A measure of shooting efficiency that takes into account 2-point field goals, 3-point field goals, and free throws.
Offense Four Factors:
eFG – Effective Field Goal Percentage. This statistic adjusts for the fact that a 3-point field goal is worth one more point than a 2-point field goal.
TOV – Turnover Percentage. An estimate of turnovers committed per 100 plays.
ORB. – Offensive Rebound Percentage. An estimate of the percentage of available offensive rebounds a player grabbed while he was on the floor.
FT/FGA – Free Throws Per Field Goal Attempt.
Defense Four Factors:
DRB. – Defensive Rebound Percentage. An estimate of the percentage of available defensive rebounds a player grabbed while he was on the floor.
DRB – Defensive Rebounds
ORB – Offensive Rebounds
TRB – Total Rebounds
AST – Assists
G – Games
MP – Minutes Played
FG – Field Goals
FGA – Field Goal Attempts
FG. – Field Goal Percentage
X3P or 3PFG – 3-Point Field Goals
X3PA or 3PFGA – 3-Point Field Goal Attempts
X3P. or 3PFGAP – 3-Point Field Goal Percentage
X2P or 2PFG – 2-Point Field Goals
X2PA or 2PFGA – 2-Point Field Goal Attempts
X2P. or 2PFGP – 2-Point Field Goal Percentage
Attend. – Attendance
WP – Winning Percentage
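Several glossary entries are derived from raw counts, so a short sketch of the standard formulas may help. The Python functions below are illustrative and are not taken from the project's Rmd file. Function and argument names are hypothetical, the 0.44 free-throw weight is the commonly used value in the True Shooting formula, and the exponent of 14 for Pythagorean wins is an assumption based on the value Basketball-Reference is understood to use. The sample inputs are made up.

```python
def three_point_attempt_rate(fg3a, fga):
    """X3PAr: share of field goal attempts taken from 3-point range."""
    return fg3a / fga

def free_throw_attempt_rate(fta, fga):
    """FTr: free throw attempts per field goal attempt."""
    return fta / fga

def effective_fg_pct(fg, fg3, fga):
    """eFG: counts each made 3-pointer as 1.5 made field goals."""
    return (fg + 0.5 * fg3) / fga

def true_shooting_pct(pts, fga, fta):
    """TS: points per shooting possession, with 0.44 weighting free throws."""
    return pts / (2 * (fga + 0.44 * fta))

def winning_pct(wins, losses):
    """WP: wins divided by games played."""
    return wins / (wins + losses)

def pythagorean_wins(pts, opp_pts, games, exponent=14):
    """PW: expected wins from points scored and allowed.

    The exponent of 14 is an assumption, not confirmed by this document.
    """
    scored = pts ** exponent
    allowed = opp_pts ** exponent
    return games * scored / (scored + allowed)

# Made-up single-game totals for illustration only, NOT real data:
# 40 FG (10 of them threes) on 80 FGA, 30 3PA, 25 FTA, 110 points.
print(f"X3PAr = {three_point_attempt_rate(30, 80):.3f}")
print(f"FTr   = {free_throw_attempt_rate(25, 80):.3f}")
print(f"eFG   = {effective_fg_pct(40, 10, 80):.3f}")
print(f"TS    = {true_shooting_pct(110, 80, 25):.3f}")
print(f"WP    = {winning_pct(50, 32):.3f}")
print(f"PW    = {pythagorean_wins(110.0, 105.0, 82):.1f}")
```

Pythagorean losses (PL) follow as games minus PW, and a team that scores exactly as many points as it allows gets an expected record of half wins and half losses.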