Contract PHP Developer for Long-Term Contract to Develop and Support Web Crawlers

Cancelled. Posted Jan 25, 2006. Paid on delivery.

We are interested in hiring a PHP developer or team on a long-term contract basis with experience designing and developing web crawlers in PHP. We currently have crawlers for two retail sites (one with local store information, the other with online store information) and want to significantly expand the number of sites crawled (see the listings below). The database is MySQL.

The current code is written in PHP and comes in two varieties:

1. “On-Demand” Scripts – used to crawl sites for specific information and populate a table with that information

2. Nightly Batch Scripts – executed nightly to send data to and receive data from a website and populate a table with that information

We also have the need for the following scripts/functions:

1. Keyword Matching Script – Script to ensure that duplicate records are not created for the same information. The code must gauge the proximity of a match using an algorithm that matches keywords. This script will be used to match stores, products, and listings, and must therefore be fairly flexible.

2. Deletion Script – Script which deletes repetitive or expired listings from tables and inserts them into the table of purged records ([purged_record]).

3. Record Matching Script – Code which attempts to match nearly identical records found during the execution of the crawlers to existing records in various tables based on matching criteria. Algorithm will be explained.

4. Local Retailers Store Number Script – Given stores in the database with a brick-and-mortar presence, the script searches those stores’ websites to retrieve store information, specifically the store number (e.g., “BestBuy #48” matching an actual physical location) and any other store information not already recorded in the store table.

5. Product Picture Script – Given a product, the script captures the product's picture and stores it in a picture table in the database, avoiding duplicate pictures.

6. UPC Code Matching Algorithm – Match a product based on various attributes (namely Brand and Model and/or Key Words) to a product in the UPC database to retrieve the product’s UPC Code, which will be stored in the product table. Use the Keyword Matching Script to achieve this.
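The posting does not say how the Keyword Matching Script should gauge the proximity of a match. As an illustration only, the sketch below assumes Jaccard similarity over lowercase keyword sets; the function names (`extractKeywords`, `keywordSimilarity`, `isLikelyDuplicate`) and the 0.8 threshold are hypothetical and are not taken from the existing code.

```php
<?php
// Hypothetical sketch: the posting does not specify the matching algorithm.
// Jaccard similarity over lowercase keyword sets is an assumption.

/** Split a string into a list of unique lowercase alphanumeric keywords. */
function extractKeywords(string $text): array
{
    $tokens = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique($tokens));
}

/** Return a score in [0, 1]: shared keywords divided by all distinct keywords. */
function keywordSimilarity(string $a, string $b): float
{
    $ka = extractKeywords($a);
    $kb = extractKeywords($b);
    $union = count(array_unique(array_merge($ka, $kb)));
    if ($union === 0) {
        return 0.0; // neither string contains a keyword
    }
    $shared = count(array_intersect($ka, $kb));
    return $shared / $union;
}

/** Treat two records as the same item when the score reaches the threshold. */
function isLikelyDuplicate(string $a, string $b, float $threshold = 0.8): bool
{
    return keywordSimilarity($a, $b) >= $threshold;
}
```

Because the functions compare plain strings, the same code could in principle be pointed at store names, product titles, or listing text, with a different threshold per table if needed.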

Addendum for Future PHP Scripts:

SITES: We will be expanding the web crawls to search for all items. This listing may include, but is not limited to, Books, Cellular products, Clothing & Shoes, Computers, Food, Gas Prices, Entertainment, Housewares, Jewelry, Music & DVDs, Office Supply, Pets, Services, Travel, Toys, and Wholesale.

Local Shopping Sites:

1. [url removed, login to view]

2. [url removed, login to view]

3. [url removed, login to view]

4. [url removed, login to view]

Online Shopping Sites:

1. [Trademark masked].com

2. [url removed, login to view]

3. [url removed, login to view]

4. [url removed, login to view]

5. [url removed, login to view]

6. [url removed, login to view]

7. [url removed, login to view]

8. [url removed, login to view]

9. [url removed, login to view]

10. [url removed, login to view]

** Note: The scripts should be extensible enough to cover other price comparison websites in the future.

Brick-and-Mortar Companies which have Online Presence:

1. [url removed, login to view]

2. [url removed, login to view]

3. CompUSA

4. [url removed, login to view]

5. [url removed, login to view]

6. [url removed, login to view]

7. B&[url removed, login to view]

8. J&R

9. PC Richards

** Note: For local stores and pricing of the products above, you will need to specify a zip code. The script should be able to take a text file of zip codes and run the search for each one. The script can run in batch mode with the zip-code text file as its feed.
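The zip-code feed described in the note above could be read along the following lines. This is a minimal sketch, not the existing implementation: the function names (`parseZipCodes`, `runBatch`), the 5-digit validation rule, and the callback interface are all assumptions made for illustration.

```php
<?php
// Hypothetical sketch: function names, the 5-digit ZIP validation rule,
// and the callback interface are assumptions, not part of the posting.

/** Parse zip codes from text whose entries are separated by CR, LF, or CRLF. */
function parseZipCodes(string $contents): array
{
    $seen = [];
    foreach (preg_split('/\r\n|\r|\n/', $contents) as $line) {
        $zip = trim($line);
        // Keep well-formed 5-digit codes only; skip blanks and malformed lines.
        if (preg_match('/^\d{5}$/', $zip) === 1) {
            $seen[$zip] = true; // array keys remove repeated codes
        }
    }
    // PHP turns keys such as "10001" into integers, so cast back to strings.
    return array_map('strval', array_keys($seen));
}

/** Read the feed file and run the search callback once per zip code. */
function runBatch(string $path, callable $search): int
{
    $contents = file_get_contents($path);
    if ($contents === false) {
        throw new RuntimeException("Cannot read zip-code file: $path");
    }
    $count = 0;
    foreach (parseZipCodes($contents) as $zip) {
        $search($zip); // e.g. crawl one site for stores near this zip code
        $count++;
    }
    return $count;
}
```

Keeping the parsing separate from the file read makes the same parser usable from a nightly batch job or from an on-demand page that receives the zip codes some other way.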

NOTES:

1. For the local online sites, the Script should read zip codes in from a text file and search within a certain radius from that zip code. The text file will contain zip codes separated by carriage returns.

2. The Script should crawl all subcategories under the primary categories on each website. This hierarchy should be retained in the database. The script should be intelligent enough to search down each hierarchy of items categorized under the parent category as specified by each individual site. Hierarchical format will be specified in the tables.

3. Data will need to be captured in the respective product tables, including the category, product, and brand tables, without duplication of records. See each table for more information.

4. The script should allow another PHP page to call it on demand (i.e., initiated when required), with the data parsed into the appropriate tables without duplication of records.

5. Contractor will agree to and sign the Service Agreement contract. Finalization of the contract will be dependent on progress during the first week, to ensure a mutual fit.

6. Contractual rate is negotiated for each script/website or a quote provided on a bundled basis. Each payment will be contingent upon meeting milestones throughout the project (i.e., delivering working code). The final payment will be made before the end of the entire project dependent on meeting all requirements and conclusion of testing.

7. Contractor must participate in bi-weekly meetings and, once a project starts, provide daily status updates and be available on an ad hoc basis for conference calls via Skype or a toll number to the United States.

8. Scripts must be periodically backed up, uploaded and tested on our servers. Final acceptance of a script is contingent on testing and approval by management.

9. Contractor will provide a project timeline noting the milestones at which the various scripts and web crawlers will be delivered. If the contractor anticipates missing a deadline, the contractor must notify management or risk breach of contract. Any delays in the project timeline must be agreed to by both parties.

10. Contractor will work closely with PeerShopper™ team members in testing and integrating scripts into existing code.

11. Contractor will provide evidence of previous work using PHP, preferably in developing web crawlers.

12. You must include 4 months of assistance and upgrades on a weekly basis.

Data Processing, PHP, Script Install

Project ID: #41239

About the project

6 proposals, remote project, active Feb 11, 2006

6 freelancers are bidding on this job, at an average of $290.

siddhartha1

a placeholder bid, thanx for the invitation, we are interested but this would cost you more, pls revise you budget range,thanx

$300 USD in 30 days
(34 reviews)
7.6
cliver

Hello, Please look at the PMB. Thanks, Sergey

$300 USD in 3 days
(25 reviews)
6.6
Spin

Thanks for the invitation, are ready to cooperate.

$300 USD in 30 days
(3 reviews)
4.6
vslook

We are a Toronto, Canada based web development firm and have the team you need for ongoing development of your crawlers. We have extensive experience working with a variety of crawlers in PHP/MySQL and have developed cus...

$300 USD in 7 days
(1 review)
4.3