|
Post by Lasagna and tears of failure on Dec 19, 2020 18:43:28 GMT -8
there is a better way of doing the above with an array and a loop, but still figuring out R syntax
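For anyone else still figuring out the R syntax, here is a minimal sketch of the "array and a loop" idea. In R the basic array is a vector built with c(), and for walks over its values. The player names and the describe_player helper are made up for illustration and are not part of the posted script.

```r
# Hypothetical roster; in R the basic "array" is a vector built with c().
players <- c("Player A", "Player B", "Player C")

# Hypothetical stand-in for whatever work is done per player.
describe_player <- function(name) {
  paste("looking up", name)
}

# Loop over the values directly...
for (p in players) {
  cat(describe_player(p), "\n")
}

# ...or collect one result per player without writing the loop by hand.
results <- vapply(players, describe_player, character(1))
```

The for loop is the closest match to what the thread describes; vapply is just the version to reach for when you want the results back as a vector.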
|
|
Deleted
Deleted Member
Posts: 0
|
Post by Deleted on Dec 19, 2020 18:51:40 GMT -8
to run it, first you have to install R. all of this is on Windows; for anything else, if you can't find it let me know and I'll find the links for you
mirror.las.iastate.edu/CRAN/ (download link: mirror.las.iastate.edu/CRAN/bin/windows/base/R-4.0.3-win.exe )
once you install that, open it and install the rvest package: choose "install package", you will get a HUGE list, scroll down to rvest and double click on it
Download and install R Studio: rstudio.com/products/rstudio/download/#download
open it, paste this code and hit enter
rm(list=ls())
options(stringsAsFactors = FALSE)
options(digits=3)
library('rvest')
path <- "https://bballsim2020.com/"
# ADD YEAR PLUS / FOR PAST SEASONS
teams <- c("Celtics", "Heat", "Nets", "Knicks", "Magic", "76ers", "Wizards", "Hawks", "Hornets", "Bulls", "Cavaliers", "Pistons", "Pacers", "Bucks", "Raptors", "Mavericks", "Nuggets", "Rockets", "Timberwolves", "Spurs", "Jazz", "Grizzlies", "Warriors", "Clippers", "Lakers", "Suns", "Trail Blazers", "Kings", "SuperSonics")
colMax <- function(data) sapply(data, max, na.rm = TRUE)
get_player_stats <- function(player, team, start = 1, end = 200, per36 = FALSE) {
  stats <- c()
  for (i in seq_along(team)) {
    # the position of the team in `teams` is its roster page number
    index <- which(team[i] == teams)
    url <- paste(path, "/rosters/roster", index, "sched.htm", sep = "")
    webpage <- read_html(url)
    # collect every link on the schedule page, then keep links 10 through the second-to-last
    links <- html_attr(html_nodes(webpage, "a"), "href")
    links <- links[10:(length(links) - 1)]

    for (j in seq_along(links)) {
      url <- paste(path, substr(links[j], 3, nchar(links[j])), sep = "")
      # the day number sits between "boxes/" and the first "-" in the URL
      day <- as.numeric(substr(url, gregexpr("boxes/", url)[[1]][1] + 6, gregexpr("-", url)[[1]][1] - 1))

      if (day >= start && day <= end) {
        full_box <- html_table(read_html(url), header = TRUE, fill = TRUE)

        # tables 2 and 3 are the two teams' box scores; find which one is ours
        if (colnames(full_box[[2]])[1] == team[i]) {
          my_team_index <- 2
          opp_team_index <- 3
        } else {
          my_team_index <- 3
          opp_team_index <- 2
        }

        box <- full_box[[my_team_index]]
        # drop the last two rows and the last column, then make the stat columns numeric
        box_team <- box[1:(nrow(box) - 2), 1:(ncol(box) - 1)]
        box_team[, 3:16] <- lapply(box_team[, 3:16], as.numeric)
        colnames(box_team)[6:7] <- c("X3P", "X3PA")
        if (colnames(box_team)[1] == team[i]) {
          index <- which(box_team[, 1] == player)
          if (length(index) > 0) {
            colnames(box_team)[1] <- "Team"
            stats <- rbind(stats, box_team[index, ])
          }
        }
      }
    }
  }

  if (is.null(stats) || nrow(stats) == 0) {
    cat("No stats found.\n")
  } else {
    stats <- cbind.data.frame(GM = nrow(stats), stats[, 3:16])

    cat("AVERAGES\n")
    averages <- colMeans(stats)
    print(round(averages, digits = 1))

    if (per36 == TRUE) {
      cat("\nPER 36\n")
      # averages[2] is MIN; scale everything after GM to 36 minutes
      constant <- (36 / averages[2])
      for (i in 2:length(averages)) {
        averages[i] <- constant * averages[i]
      }
      print(round(averages, digits = 1))
    }

    cat("\nPERCENTAGES\n")
    print(c(sum(stats$FG) / sum(stats$FGA), sum(stats$X3P) / sum(stats$X3PA), sum(stats$FT) / sum(stats$FTA)))
  }
}
get_team_stats <- function(team) {
  # `team` is a vector of player names stored in a variable named after the team,
  # e.g. Jazz <- c("LeBron James"); deparse(substitute(team)) turns that variable name into "Jazz"
  for (i in seq_along(team)) {
    cat(team[i])
    cat("\n")
    get_player_stats(team[i], deparse(substitute(team)), start = 1)
    cat("\n")
  }
}
then paste this at the ">" prompt
get_player_stats("LeBron James", "Jazz", start = 36, end = 45)
substitute the player name and team, with day numbers for start and end. in my testing, much over 10 days times out because the web pages close connections after a certain amount of time.
what is this for?
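One way to live with the roughly 10-day limit mentioned above is to split a long range into shorter windows and run them one at a time. This is only a sketch: day_windows is a hypothetical helper that is not in the posted script, and the 10-day size is taken from the testing note, not from anything the site documents.

```r
# Split a day range into consecutive windows of at most `size` days.
# Assumes start <= end.
day_windows <- function(start, end, size = 10) {
  starts <- seq(start, end, by = size)
  ends <- pmin(starts + size - 1, end)
  data.frame(start = starts, end = ends)
}

windows <- day_windows(36, 60)
print(windows)

# Each row could then be fed to the posted function, for example:
# for (k in seq_len(nrow(windows))) {
#   get_player_stats("LeBron James", "Jazz",
#                    start = windows$start[k], end = windows$end[k])
# }
```

Note that get_player_stats prints its averages instead of returning them, so each window would print its own averages. Getting one combined average over all windows would need the function to return the per-game rows.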
|
|
|
Post by Lasagna and tears of failure on Dec 19, 2020 18:53:11 GMT -8
@cmart
it allows you to scrape the box scores for a set of days that you choose, to see how players do over that stretch
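The "set of days" part comes down to reading the day number out of each box score link and keeping only the links in range. A rough sketch of that step, using a regular expression instead of the substr()/gregexpr() arithmetic in the script. The link strings below are hypothetical, shaped the way the script's logic implies ("boxes/", then the day, then "-"); the real site's links may look different.

```r
# Hypothetical links in the shape the script's parsing implies:
# ".../boxes/<day>-<something>.htm".
links <- c("../boxes/36-5.htm", "../boxes/41-12.htm", "../boxes/120-3.htm")

# Capture the digits between "boxes/" and the first "-" after them.
box_day <- function(link) {
  as.numeric(sub(".*boxes/([0-9]+)-.*", "\\1", link))
}

days <- box_day(links)

# Keep only the box scores played between day 36 and day 45.
in_range <- links[days >= 36 & days <= 45]
print(in_range)
```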
|
|
Go Niners
Wait List
Wait List
Niners!
Posts: 35,751
|
Post by Go Niners on Dec 19, 2020 19:54:35 GMT -8
I have mac
|
|
|
Post by Lasagna and tears of failure on Dec 19, 2020 21:12:31 GMT -8
|
|
|
Post by Go Niners on Dec 19, 2020 21:12:56 GMT -8
Which one better?
|
|
|
Post by Lasagna and tears of failure on Dec 19, 2020 21:13:03 GMT -8
@sargo iOS links above
|
|
|
Post by Go Niners on Dec 19, 2020 21:13:21 GMT -8
Oh I gotta download both
|
|
|
Post by Go Niners on Dec 19, 2020 21:13:39 GMT -8
|
|
|
Post by Lasagna and tears of failure on Dec 19, 2020 21:13:45 GMT -8
you need both, you have to install "R" for "R studio" to work
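Once both are installed, a quick check from the R Studio console shows whether the script's one dependency is there. A small sketch: missing_packages is a made-up helper name, while requireNamespace, packageVersion and R.version.string are base R.

```r
# Return the packages from `pkgs` that are not installed yet.
missing_packages <- function(pkgs) {
  pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
}

# Show which R is running.
cat(R.version.string, "\n")

needed <- missing_packages(c("rvest"))
if (length(needed) > 0) {
  message("Run install.packages(c(", paste0('"', needed, '"', collapse = ", "), ")) first")
} else {
  message("rvest is installed; version ", as.character(packageVersion("rvest")))
}
```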
|
|
|
Post by Go Niners on Dec 19, 2020 21:13:52 GMT -8
You mean OS X right?
|
|
|
Post by Go Niners on Dec 19, 2020 21:14:27 GMT -8
Imagine trying to do this on iPhone
|
|
|
Post by Go Niners on Dec 19, 2020 21:14:53 GMT -8
Guess iPad makes sense
|
|
|
Post by Go Niners on Dec 19, 2020 21:15:08 GMT -8
I need to do it tho
|
|
|
Post by Lasagna and tears of failure on Dec 19, 2020 21:15:21 GMT -8
|
|
|
Post by Go Niners on Dec 19, 2020 21:15:41 GMT -8
|
|
|
Post by Go Niners on Dec 19, 2020 21:16:25 GMT -8
I'll do it once I'm done spamming
|
|
|
Post by Lasagna and tears of failure on Dec 20, 2020 9:03:14 GMT -8
"I'll do it once I'm done spamming"
did you get everything installed?
|
|
|
Post by Lasagna and tears of failure on Dec 20, 2020 9:03:57 GMT -8
gotta say the programming change I did last night... so amazing... gotta figure out the array loop to make it better though
|
|
JR
Wait List
Wait List
Posts: 38,037
|
Post by JR on Dec 20, 2020 9:04:46 GMT -8
"gotta say the programming change I did last night... so amazing... gotta figure out the array loop to make it better though"
What you do?
|
|
|
Post by Lasagna and tears of failure on Dec 20, 2020 9:07:30 GMT -8
"gotta say the programming change I did last night... so amazing... gotta figure out the array loop to make it better though"
"What you do?"
bbs2020.proboards.com/post/310103/thread
so with one command it spits out your entire team
> get_team_stats(start = 81)
Michael Kidd-Gilchrist
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 35.7  8.0 19.3  1.1  2.4  2.4  3.1  5.7  1.7  1.6  1.4  1.6  0.9 19.6

PERCENTAGES
[1] 0.415 0.471 0.773

Otto Porter
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 32.1  5.4 15.3  1.1  3.6  3.7  5.1  8.0  3.0  1.6  1.0  2.0  0.1 15.7

PERCENTAGES
[1] 0.355 0.320 0.722

DeJuan Blair
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 34.4  6.4 15.3  0.0  0.0  3.4  5.7 11.1  2.0  2.3  0.7  2.1  1.0 16.3

PERCENTAGES
[1] 0.421 NaN 0.600

Mike Scott
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 29.6  4.1  9.9  0.0  0.6  0.9  1.3  4.3  3.0  1.0  0.4  1.6  0.6  9.1

PERCENTAGES
[1] 0.420 0.000 0.667

Paccelis Morlende
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 36.7  1.9  3.7  0.0  0.0  1.0  1.1  9.1  2.0  2.4  2.0  0.9  3.6  4.7

PERCENTAGES
[1] 0.500 NaN 0.875

Chandler Parsons
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 12.1  1.6  3.7  0.3  1.1  1.3  1.4  1.6  1.0  0.7  0.6  0.6  0.0  4.7

PERCENTAGES
[1] 0.423 0.250 0.900

Dennis Schroder
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 34.1  6.3 11.7  1.0  1.7  0.4  0.7  2.3  6.3  3.9  0.6  1.9  0.0 14.0

PERCENTAGES
[1] 0.537 0.583 0.600

John Henson
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 13.6  1.6  4.6  0.0  0.0  0.1  0.4  4.3  0.4  1.3  0.1  0.4  0.4  3.3

PERCENTAGES
[1] 0.344 NaN 0.333

Mason Plumlee
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 11.6  0.6  3.4  0.0  0.0  0.6  1.3  2.4  0.1  1.4  0.4  0.1  0.3  1.7

PERCENTAGES
[1] 0.167 NaN 0.444

Ronny Turiaf
No stats found.
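The NaN entries in the PERCENTAGES lines are 0 divided by 0: those players took no three-pointers over the stretch (X3PA is 0.0), so sum(X3P) / sum(X3PA) has nothing to divide by. A small sketch of a guard that reports NA instead; safe_pct is a hypothetical helper, not part of the posted script.

```r
# Shooting percentage that returns NA when there were no attempts,
# instead of the NaN that 0 / 0 produces.
safe_pct <- function(made, attempted) {
  total <- sum(attempted)
  if (total == 0) NA_real_ else sum(made) / total
}

no_attempts <- safe_pct(c(0, 0), c(0, 0))    # NA, not NaN
some_attempts <- safe_pct(c(1, 2), c(3, 3))  # 3 made out of 6 attempts
print(c(no_attempts, some_attempts))
```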
|
|
|
Post by Go Niners on Dec 20, 2020 9:13:38 GMT -8
"I'll do it once I'm done spamming"
"did you get everything installed?"
Got lazy, gonna try it later
|
|
ashes
Wait List
Wait List
Posts: 36,162
|
Post by ashes on Dec 20, 2020 14:53:01 GMT -8
oh
|
|
|
Post by Lasagna and tears of failure on Dec 20, 2020 15:13:09 GMT -8
later tonight, I'll work on the looping array to make it even more friendly to update season to season
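A sketch of what "friendly to update season to season" could look like: keep the season and the roster as data instead of hard-coding them inside the function. The script's own comment says past seasons are reached by adding the year plus "/" to the path; the exact year format (2019 below) is an assumption, and season_path and rosters are hypothetical names.

```r
# Build the base URL for a season. With no year this is the current season;
# the "year plus /" rule comes from the comment in the script, the year
# format itself is a guess.
season_path <- function(base = "https://bballsim2020.com/", year = NULL) {
  if (is.null(year)) base else paste0(base, year, "/")
}

# Rosters kept as a named list: one vector of player names per team.
rosters <- list(
  Celtics = c("Michael Kidd-Gilchrist", "Otto Porter", "DeJuan Blair")
)

current <- season_path()
past <- season_path(year = 2019)

# Walk every team and every player; a real run would call the stats
# function here instead of just printing the names.
for (team_name in names(rosters)) {
  for (player_name in rosters[[team_name]]) {
    cat(team_name, "-", player_name, "\n")
  }
}
```

Updating for a new season then means editing the list and the year, not the function body.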
|
|
|
Post by Lasagna and tears of failure on Dec 20, 2020 21:19:47 GMT -8
|
|
|
Post by Lasagna and tears of failure on Dec 20, 2020 21:26:23 GMT -8
rm(list=ls())
options(stringsAsFactors = FALSE)
options(digits=3)
library('rvest')
path <- "https://bballsim2020.com/"
# ADD YEAR PLUS / FOR PAST SEASONS
teams <- c("Celtics", "Heat", "Nets", "Knicks", "Magic", "76ers", "Wizards", "Hawks", "Hornets", "Bulls", "Cavaliers", "Pistons", "Pacers", "Bucks", "Raptors", "Mavericks", "Nuggets", "Rockets", "Timberwolves", "Spurs", "Jazz", "Grizzlies", "Warriors", "Clippers", "Lakers", "Suns", "Trail Blazers", "Kings", "SuperSonics")
colMax <- function(data) sapply(data, max, na.rm = TRUE)
get_player_stats <- function(player, team, start = 1, end = 200, per36 = FALSE) {
  stats <- c()
  for (i in seq_along(team)) {
    # the position of the team in `teams` is its roster page number
    index <- which(team[i] == teams)
    url <- paste(path, "/rosters/roster", index, "sched.htm", sep = "")
    webpage <- read_html(url)
    # collect every link on the schedule page, then keep links 10 through the second-to-last
    links <- html_attr(html_nodes(webpage, "a"), "href")
    links <- links[10:(length(links) - 1)]

    for (j in seq_along(links)) {
      url <- paste(path, substr(links[j], 3, nchar(links[j])), sep = "")
      # the day number sits between "boxes/" and the first "-" in the URL
      day <- as.numeric(substr(url, gregexpr("boxes/", url)[[1]][1] + 6, gregexpr("-", url)[[1]][1] - 1))

      if (day >= start && day <= end) {
        full_box <- html_table(read_html(url), header = TRUE, fill = TRUE)

        # tables 2 and 3 are the two teams' box scores; find which one is ours
        if (colnames(full_box[[2]])[1] == team[i]) {
          my_team_index <- 2
          opp_team_index <- 3
        } else {
          my_team_index <- 3
          opp_team_index <- 2
        }

        box <- full_box[[my_team_index]]
        # drop the last two rows and the last column, then make the stat columns numeric
        box_team <- box[1:(nrow(box) - 2), 1:(ncol(box) - 1)]
        box_team[, 3:16] <- lapply(box_team[, 3:16], as.numeric)
        colnames(box_team)[6:7] <- c("X3P", "X3PA")
        if (colnames(box_team)[1] == team[i]) {
          index <- which(box_team[, 1] == player)
          if (length(index) > 0) {
            colnames(box_team)[1] <- "Team"
            stats <- rbind(stats, box_team[index, ])
          }
        }
      }
    }
  }

  if (is.null(stats) || nrow(stats) == 0) {
    cat("No stats found.\n")
  } else {
    stats <- cbind.data.frame(GM = nrow(stats), stats[, 3:16])

    cat("AVERAGES\n")
    averages <- colMeans(stats)
    print(round(averages, digits = 1))

    if (per36 == 1) {
      cat("\nPER 36\n")
      # averages[2] is MIN; scale everything after GM to 36 minutes
      constant <- (36 / averages[2])
      for (i in 2:length(averages)) {
        averages[i] <- constant * averages[i]
      }
      print(round(averages, digits = 1))
    }

    cat("\nPERCENTAGES\n")
    print(c(sum(stats$FG) / sum(stats$FGA), sum(stats$X3P) / sum(stats$X3PA), sum(stats$FT) / sum(stats$FTA)))
  }
}
get_team_stats <- function(start = 1, end = 200, per36 = FALSE) {
  team_N <- "Celtics"
  player_N <- c("Michael Kidd-Gilchrist", "Otto Porter", "DeJuan Blair", "Ronny Turiaf",
                "Mason Plumlee", "John Henson", "Marko Todorovic", "Mike Scott",
                "Chandler Parsons", "Dennis Schroder", "Paccelis Morlende")
  for (i in player_N) {
    print(i)
    # pass per36 through, otherwise the PER 36 block never prints
    get_player_stats(i, team_N, start, end, per36)
    cat("\n")
    cat("\n")
  }
}
|
|
|
Post by Lasagna and tears of failure on Dec 20, 2020 21:26:57 GMT -8
the key changes are to this function
get_team_stats <- function(start = 1, end = 200, per36 = FALSE) {
  team_N <- "Celtics"
  player_N <- c("Michael Kidd-Gilchrist", "Otto Porter", "DeJuan Blair", "Ronny Turiaf",
                "Mason Plumlee", "John Henson", "Marko Todorovic", "Mike Scott",
                "Chandler Parsons", "Dennis Schroder", "Paccelis Morlende")
  for (i in player_N) {
    print(i)
    # pass per36 through, otherwise the PER 36 block never prints
    get_player_stats(i, team_N, start, end, per36)
    cat("\n")
    cat("\n")
  }
}
|
|
|
Post by Lasagna and tears of failure on Dec 20, 2020 21:45:53 GMT -8
oh heck yeah, finally got the per/36 part of the program to work too
[1] "Chandler Parsons"
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 12.1  1.6  3.7  0.3  1.1  1.3  1.4  1.6  1.0  0.7  0.6  0.6  0.0  4.7

PER 36
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 36.0  4.7 11.0  0.8  3.4  3.8  4.2  4.7  3.0  2.1  1.7  1.7  0.0 14.0

PERCENTAGES
[1] 0.423 0.250 0.900
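For anyone wondering where the PER 36 line comes from: every average after GM is multiplied by 36 divided by average minutes. A small sketch using a few of the rounded Chandler Parsons averages above. Because these inputs are already rounded to one decimal, some results can differ from the printed line by 0.1; the script works from the unrounded averages.

```r
# A few of the rounded averages from the output above.
averages <- c(GM = 7.0, MIN = 12.1, REB = 1.6, PTS = 4.7)

# Scale everything except games played to a 36-minute pace.
per36 <- averages
per36[-1] <- averages[-1] * 36 / averages[["MIN"]]

print(round(per36, 1))
```

MIN itself scales to 36, and 4.7 points in 12.1 minutes works out to about 14.0 points per 36.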
|
|
|
Post by Lasagna and tears of failure on Dec 20, 2020 21:48:56 GMT -8
|
|
|
Post by hf on Dec 20, 2020 21:56:47 GMT -8
"oh heck yeah, finally got the per/36 part of the program to work too"
[1] "Chandler Parsons"
AVERAGES
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 12.1  1.6  3.7  0.3  1.1  1.3  1.4  1.6  1.0  0.7  0.6  0.6  0.0  4.7

PER 36
  GM  MIN   FG  FGA  X3P X3PA   FT  FTA  REB    A   PF   ST   TO   BL  PTS
 7.0 36.0  4.7 11.0  0.8  3.4  3.8  4.2  4.7  3.0  2.1  1.7  1.7  0.0 14.0

PERCENTAGES
[1] 0.423 0.250 0.900

Oh that’s niceeee
|
|