Custom script for scraping stats from [login to view URL]

Completed · Posted Nov 20, 2013 · Paid on delivery

Hello,

What I need for this project is two custom scripts, preferably written in Python.

The first script takes a range of dates as command-line input and scrapes all game stats for those dates from [url removed, login to view]'s scoreboard page. For example, if I wanted to scrape the games from the single date Nov 14th, 2013, the script would start at the following URL:

[url removed, login to view]

and then recursively follow all of the "boxscore" links that are on that page to the game stat tables, for instance here:

[url removed, login to view]
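The two steps above (walk a date range, then collect the boxscore links from each scoreboard page) could be sketched roughly as follows. The real URLs are not shown in this post, so `SCOREBOARD_URL` is a hypothetical placeholder, and the rule "a boxscore link is any link whose href contains `boxscore`" is an assumption that would need checking against the actual page.

```python
import argparse
from datetime import date, datetime, timedelta
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical placeholder: the real scoreboard URL is not shown in the post.
SCOREBOARD_URL = "https://example.com/scoreboard?date={:%Y%m%d}"


def as_date(text):
    """Convert a YYYY-MM-DD command-line argument into a date object."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def parse_args(argv):
    """Read the start date, end date, and output file name."""
    parser = argparse.ArgumentParser(description="Scrape game stats for a date range.")
    parser.add_argument("start", type=as_date, help="first date, YYYY-MM-DD")
    parser.add_argument("end", type=as_date, help="last date, YYYY-MM-DD (inclusive)")
    parser.add_argument("outfile", help="tab-delimited output file")
    return parser.parse_args(argv)


def date_range(start, end):
    """Yield every date from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


class BoxscoreLinkParser(HTMLParser):
    """Collect unique links whose href contains 'boxscore' (an assumption)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if "boxscore" in href:
            url = urljoin(self.base_url, href)  # resolve relative links
            if url not in self.links:
                self.links.append(url)


def boxscore_links(html, base_url):
    """Return the boxscore URLs found in one scoreboard page."""
    parser = BoxscoreLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

With this layout, a single date is just a range whose start and end are equal, for example `parse_args(["2013-11-14", "2013-11-14", "games.txt"])`.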

I would then need all of the "Basic" and "Advanced" stats pulled from the table at the top of the page, which are the rows of the table with the identifier "Game Final" next to the team ID. For a given date, the script would pull all games for that date and write them into a tab-delimited text file, the name of which is also passed as a command-line parameter. Note that team name identifiers should be scraped as listed at the top of the boxscore pages, rather than the ones listed on the scoreboard page or the abbreviated ones found in the boxscore table. If there are numeric values before the team name indicating national ranking, they should be discarded. In the previous example, Connecticut's team name in the output file should read "Connecticut Huskies" rather than "UCONN" or "Connecticut" (and definitely not "#19 Connecticut Huskies").
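The name cleanup and the tab-delimited output could look something like this. It is a sketch: the ranking prefix is assumed to be a number, optionally preceded by `#` or wrapped in parentheses, followed by whitespace, and the function names are made up for illustration.

```python
import csv
import io
import re

# Assumed ranking formats: "#19 ", "19 ", or "(19) " at the start of the name.
RANK_PREFIX = re.compile(r"^\s*\(?#?\d+\)?\s+")


def clean_team_name(raw):
    """Drop a leading national ranking and surrounding whitespace."""
    return RANK_PREFIX.sub("", raw).strip()


def rows_to_tsv(rows):
    """Render rows (lists of strings) as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def save_tsv(path, rows):
    """Write rows to the file named on the command line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(rows_to_tsv(rows))
```

For instance, `clean_team_name("#19 Connecticut Huskies")` gives `"Connecticut Huskies"`, while a name without a ranking passes through unchanged.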

Attached are two template files. The first is for a few games' stats: each row consists of the concatenated rows of the corresponding game stat boxscore table, for only the rows labelled "Game Final." (I added the header line myself.) Some game tables also have rows labelled "Offensive Avg" and "Defensive Avg", but if these rows are present, they can be discarded. The "Game Final" rows should exist for every game, and that is what I need. Note that there are two tables on each boxscore page: one labelled "Basic" and one labelled "Advanced." I need the rows labelled "Game Final" for both of these tables, and for both teams.
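Once the two tables have been parsed into rows of cell text, keeping only the "Game Final" rows is a small filter. The sketch below assumes each row is a list of the form `[team_id, row_label, stat, stat, ...]`; that cell layout is a guess, since the actual tables are not shown here.

```python
FINAL_LABEL = "Game Final"


def final_stats_by_team(table_rows):
    """Map team ID -> stat cells, keeping only rows labelled 'Game Final'.

    Assumed row layout: [team_id, row_label, stat, stat, ...].
    Rows such as 'Offensive Avg' and 'Defensive Avg' are skipped.
    """
    stats = {}
    for row in table_rows:
        cells = [cell.strip() for cell in row]
        if len(cells) >= 2 and cells[1] == FINAL_LABEL:
            stats[cells[0]] = cells[2:]
    if len(stats) != 2:
        raise ValueError(f"expected 2 'Game Final' rows, found {len(stats)}")
    return stats


def combine_tables(basic_rows, advanced_rows):
    """Join each team's Basic and Advanced 'Game Final' stats into one list."""
    basic = final_stats_by_team(basic_rows)
    advanced = final_stats_by_team(advanced_rows)
    return {team: basic[team] + advanced[team] for team in basic}
```

Raising an error when a table does not yield exactly two "Game Final" rows makes a changed page layout visible immediately instead of silently producing short rows.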

Additionally: the home team is always listed on the bottom of the box score at the top of the page, but I'm not sure that the actual game stat table follows any such convention. Therefore, when you scrape the team names from the top of the boxscore page, please make sure that the home team's game stats are listed first in the concatenated row in the output file. This is VERY important.
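One way to guarantee the home-first ordering is to decide home and away from the header only, then look each team's stats up by name rather than trusting the stat table's row order. A sketch, in which `TeamStats` and both function names are hypothetical and matching the full team name to its stat rows is assumed to have been done already:

```python
from dataclasses import dataclass


@dataclass
class TeamStats:
    name: str       # full name as shown at the top of the boxscore page
    basic: list     # "Game Final" cells from the Basic table
    advanced: list  # "Game Final" cells from the Advanced table


def split_home_away(header_names):
    """Return (home, away) from header names given in top-to-bottom order.

    Per the post, the home team is always the bottom entry.
    """
    if len(header_names) != 2:
        raise ValueError("expected exactly two team names in the header")
    away, home = header_names
    return home, away


def build_game_row(header_names, stats_by_name):
    """Concatenate both teams' stats into one output row, home team first."""
    row = []
    for name in split_home_away(header_names):
        team = stats_by_name[name]
        row.extend([team.name, *team.basic, *team.advanced])
    return row
```

Because the lookup is keyed by name, the result is the same whichever order the stat table happens to list the teams in.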

The second script is very simple: for a given date parameter, it would just pull all scheduled matchups for that date. So from the page:

[url removed, login to view]

it would just pull the home and away team names and print them to a file that consists of three columns: date, home team name, and away team name. I attached a second template for this script's output, which is pretty self-explanatory. It's just the scheduled games for a given day, with date and teams in the columns.
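The output side of the second script might be sketched as below. The attached template is not reproduced here, so the ISO date format and the tab delimiter are assumptions, and the matchups are taken as already-scraped `(home, away)` name pairs.

```python
import csv
import io
from datetime import date


def schedule_rows(game_date, matchups):
    """Build [date, home, away] rows from (home, away) name pairs."""
    return [[game_date.isoformat(), home, away] for home, away in matchups]


def schedule_to_text(rows):
    """Render the three-column schedule as tab-delimited text (assumed format)."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def save_schedule(path, game_date, matchups):
    """Write one day's scheduled games to a file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(schedule_to_text(schedule_rows(game_date, matchups)))
```

The team names in any call to these functions would come from the scoreboard page; the ones used to try it out can be arbitrary strings.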

There are potentially some issues with IP blocking from this site, so if you could build in some protection against that, as well as some commented instructions for how to use it, I would be very grateful. I have a programming background, but it is more focused on algorithms, so my web programming proficiency is not great, or else I would do this myself. The project is also time sensitive, but as long as the code is commented I will be able to understand it.
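A common first line of defence against IP blocking is simply to slow down: pause before every request and back off exponentially after a failure. The sketch below uses only the standard library; the delay values and the `USER_AGENT` string are illustrative assumptions, not settings known to work for this site.

```python
import random
import time
import urllib.error
import urllib.request

# Illustrative value; some sites reject the default Python user agent.
USER_AGENT = "Mozilla/5.0 (compatible; stats-scraper-sketch)"


def fetch(url, retries=4, base_delay=2.0,
          opener=urllib.request.urlopen, sleep=time.sleep):
    """Download url as text, throttling requests and retrying on errors.

    opener and sleep are parameters so a proxy opener (or a test double)
    can be swapped in without changing this function.
    """
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    for attempt in range(retries):
        # Pause before every request, with jitter so timing is not regular.
        sleep(base_delay + random.uniform(0, 1))
        try:
            with opener(request, timeout=30) as response:
                return response.read().decode("utf-8", errors="replace")
        except urllib.error.URLError:
            if attempt == retries - 1:
                raise  # out of retries: let the caller see the error
            # Exponential backoff: 2x, 4x, 8x ... the base delay.
            sleep(base_delay * 2 ** (attempt + 1))


def proxy_opener(proxy_url):
    """Return an opener function that routes requests through a proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler).open
```

To route traffic through a proxy, pass `opener=proxy_opener("http://host:port")` to `fetch`, where the proxy address is whatever service is available to you.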

Web scraping

Project ID: #5147631

About the project

6 proposals · Remote project · Active Nov 20, 2013

Awarded to:

nekhbet

Thank you for the invite. I can do this project with proxy support (so you won't have issues with the IP blocking). The only issue is that I don't know Python, so the solution will be in PHP. Is that a problem? Re...

$222 USD in 5 days
(14 reviews)
4.8

6 freelancers are bidding on this job, at an average of $251.

mantislin

Hi sir, I am a scraping expert and have done many similar projects; please check my feedback and you will see. Can you tell me more details? Then I will provide demo data for you. Thanks, Kimi

$285 USD in 6 days
(73 reviews)
6.2

uumarkhalid31

Hi, I am an expert in web scraping and interested in this project. Let me do this work with perfection and accuracy, according to your requirements. Thanks

$126 USD in 3 days
(30 reviews)
5.2

WebDevelopers11

Hi there, I am an expert web scraper and data miner, and I have a good team to do projects like the one you just posted. I am interested in doing it in this short time frame, with 100% accuracy assured. Award me so that I can start w...

$105 USD in 1 day
(4 reviews)
3.2