Chrome Extension that Downloads Pages and Takes Screenshots

Cancelled. Posted Jul 3, 2011. Paid on delivery.

## Deliverables

I need a Chrome extension that pulls a JSON list of URLs from a RESTful web service URL (to be defined later). It must then navigate to each URL (processing its JavaScript) and save the HTML (not necessarily equivalent to view source, but the post-JavaScript HTML) to an Amazon S3 bucket. It must then take a screenshot of the rendered page and save it to another Amazon S3 bucket.

The extension needs to keep track of which captures succeeded and which failed. After every n URLs, it needs to send these success/fail statuses to another RESTful web service as a JSON object.

When the extension has finished processing its batch of URLs, it needs to send its remaining success/fail statuses and request another list of URLs from the original web service.
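
The status bookkeeping described above could be sketched roughly as follows. This is only an illustration, not a required design: the `StatusTracker` class, its method names, and the shape of each recorded entry are all hypothetical.

```javascript
// Hypothetical helper: buffers per-URL results and hands back a batch
// whenever `flushEvery` results have accumulated (the "n" in the brief).
class StatusTracker {
  constructor(flushEvery) {
    if (!Number.isInteger(flushEvery) || flushEvery < 1) {
      throw new RangeError("flushEvery must be a positive integer");
    }
    this.flushEvery = flushEvery;
    this.pending = [];
  }

  // Record one URL's outcome. Returns the batch to send when the buffer
  // is full, otherwise null.
  record(urlId, htmlOk, screenshotOk) {
    this.pending.push({ urlId, htmlOk, screenshotOk });
    return this.pending.length >= this.flushEvery ? this.drain() : null;
  }

  // Return everything not yet sent and reset the buffer. Intended to be
  // called once more when the list of URLs is exhausted.
  drain() {
    const batch = this.pending;
    this.pending = [];
    return batch;
  }
}
```

The caller would send any non-null result of `record()` to the status service, then send `drain()` once the batch of URLs is exhausted, before requesting the next list.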

**Tricky Parts**

1. Getting the page source after JavaScript has rendered it, as shown by "inspect element" rather than "view source".

2. Capturing a screenshot of the entire page rather than only the visible window.
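
One possible way to approach both tricky parts is sketched below, under assumptions. For the HTML, a content script can serialize the live DOM instead of re-fetching the source. For the screenshot, `chrome.tabs.captureVisibleTab` only captures the visible area, so a common workaround is to scroll the page in viewport-sized steps, capture each step, and stitch the tiles together. The function names here are hypothetical, and only the offset calculation is shown, not the scrolling or stitching.

```javascript
// Serialize the DOM as it exists after scripts have run. `doc` would be
// `document` inside a content script. Note: outerHTML omits the doctype.
function serializeRenderedHtml(doc) {
  return doc.documentElement.outerHTML;
}

// Compute the vertical scroll positions needed to cover the whole page
// with viewport-sized captures. The last offset is clamped so the final
// tile ends exactly at the bottom of the page (it may overlap the tile
// before it).
function computeScrollOffsets(pageHeight, viewportHeight) {
  if (viewportHeight <= 0) {
    throw new RangeError("viewportHeight must be positive");
  }
  const offsets = [];
  for (let y = 0; y < pageHeight; y += viewportHeight) {
    offsets.push(Math.max(0, Math.min(y, pageHeight - viewportHeight)));
  }
  return offsets;
}
```

When stitching, each captured tile would be drawn onto a canvas at its own offset, so the overlapping part of the last tile simply overwrites identical pixels.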

**Web Service Interface**

The GetUrls method will look similar to this:

GetUrls() → JSON(List<string url, guid url_id>, Settings<AmazonCredentials, bool doScreenshot, string screenshotFormat, string screenshotQuality>)
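
As a rough illustration of consuming such a response, the sketch below parses and checks one possible JSON layout. The field names `url`, `url_id`, `AmazonCredentials`, `doScreenshot`, `screenshotFormat`, and `screenshotQuality` come from the signature above. The top-level `urls` and `settings` keys and the function name are assumptions, since the actual wire format is still to be defined.

```javascript
// Parse a GetUrls response, assuming this (hypothetical) layout:
// { "urls": [{ "url": "...", "url_id": "..." }], "settings": { ... } }
function parseGetUrlsResponse(jsonText) {
  const data = JSON.parse(jsonText);
  if (!data || !Array.isArray(data.urls)) {
    throw new TypeError("GetUrls response needs a 'urls' array");
  }
  const urls = data.urls.map((item, i) => {
    if (!item || typeof item.url !== "string" || typeof item.url_id !== "string") {
      throw new TypeError(`urls[${i}] needs string 'url' and 'url_id'`);
    }
    return { url: item.url, urlId: item.url_id };
  });
  const s = data.settings || {};
  return {
    urls,
    settings: {
      amazonCredentials: s.AmazonCredentials || null,
      doScreenshot: Boolean(s.doScreenshot),
      // Format and quality are strings in the signature, so keep them so.
      screenshotFormat: String(s.screenshotFormat || ""),
      screenshotQuality: String(s.screenshotQuality || ""),
    },
  };
}
```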

You can largely define what you need, and we'll make GetUrls return it to you. The point is that everything configurable will come down from the server except the web service URL, which will be configured in a text file.

The url_id will be globally unique and should be used as the bucket name to store the HTML and the screenshot image.

The SendStatus method will look similar to this:

SendStatus(JSONList<guid url_id, byte HTMLStatus, byte ScreenshotStatus>)
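
A request body for SendStatus might be built along these lines. The property names `url_id`, `HTMLStatus`, and `ScreenshotStatus` follow the signature above. The numeric status values (0 failed, 1 succeeded, 2 skipped) are purely an assumption, since the brief does not define them, and the function names are hypothetical.

```javascript
// Assumed status codes; the real values would be agreed with the service.
const STATUS = Object.freeze({ FAILED: 0, OK: 1, SKIPPED: 2 });

// Map true/false/null to a status byte. null or undefined means the step
// was not attempted (for example, when doScreenshot is false).
function toStatusByte(outcome) {
  if (outcome === null || outcome === undefined) return STATUS.SKIPPED;
  return outcome ? STATUS.OK : STATUS.FAILED;
}

// Build the JSON text to POST to SendStatus from an array of
// { urlId, htmlOk, screenshotOk } results.
function buildSendStatusBody(results) {
  return JSON.stringify(
    results.map((r) => ({
      url_id: r.urlId,
      HTMLStatus: toStatusByte(r.htmlOk),
      ScreenshotStatus: toStatusByte(r.screenshotOk),
    }))
  );
}
```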

The GetSettings method will look similar to this:

Web service URLs must be configurable by editing a text file, perhaps the manifest.

**Screenshot Specs**

The screenshot functionality should be able to capture the entire page (rather than the visible window).

The file format and quality must be configurable via a text file.
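
For illustration, the configured format and quality could be turned into the options object that `chrome.tabs.captureVisibleTab` accepts, which has a `format` of `"jpeg"` or `"png"` and a `quality` from 0 to 100 that applies to JPEG only. The function name and the decision to accept `"jpg"` as an alias are assumptions in this sketch.

```javascript
// Convert configured values (strings, per the GetUrls signature) into
// capture options. Quality is clamped to 0..100 and omitted for PNG,
// where Chrome ignores it.
function toCaptureOptions(format, quality) {
  const normalized = String(format).trim().toLowerCase();
  if (normalized === "png") {
    return { format: "png" };
  }
  if (normalized === "jpeg" || normalized === "jpg") {
    const q = Number.parseInt(quality, 10);
    if (Number.isNaN(q)) {
      throw new TypeError(`screenshot quality is not a number: ${quality}`);
    }
    return { format: "jpeg", quality: Math.min(100, Math.max(0, q)) };
  }
  throw new RangeError(`unsupported screenshot format: ${format}`);
}
```

The returned object would then be passed as the options argument when capturing each visible tile.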

**S3 Specs**

Amazon S3 credentials will come down in the GetUrls request.

For the purposes of development, you'll need to factor in a small cost for testing S3 storage, probably well under $10.

**Non-Functional Requirements**

Factor a small amount of scope creep into your bid.

Code needs to be very self-documenting and well commented. Factor in the time necessary to clean up your code assuming that somebody else will be reading it. Maintainability is very important here. I will be using this code as part of a larger project, and will need to tweak it after you have written it.

Amazon Web Services, Apple Safari, C# Programming, Google Chrome, JavaScript, PHP, Script Install, Shell Script, Software Architecture, Software Testing

Project ID: #3419775

About the project

Remote project, active Jul 5, 2011