Cluster Analysis (using existing code) / MySQL database

$30-250 USD

Closed
Posted: over 11 years ago

Paid on delivery
I have .[login to view URL] that I think contains all the pieces you need; if you think something is missing, please let me know. The rough workflow is as follows.

[login to view URL] will set up a database for clustering in the correct format. Ideally I would like to leverage my existing database on GoDaddy, but I am open to other suggestions. You will need to change the "data" table at the very bottom so that it is a view across your actual page data; the view is expected to show the page id (a unique identifier) and a hash of the DOM. When you run the script you can specify a database schema, and all of the tables will go in that schema.

Compile qfp.c with "gcc -o qfp qfp.c".

Run [login to view URL]. This script takes a lot of options and lets you customize where the database and all the tables are.

If you have done all this, congratulations, you have clusters in your database! The [login to view URL] table contains the actual clusters: for each page it has a (rep_id, page_id) pair, where the rep_id is essentially the cluster id (it is actually just the id of the lowest page in the cluster). Depending on what you want to do with the clusters, this may be all you need.

You can compile [login to view URL] with "javac -cp [login to view URL]:. web_clustering/[login to view URL]". You may want to make a copy of this file for your modifications; that way you can refer back to the original if you delete too much and break something. If you compile and run web_clustering/[login to view URL], it will generate a web site that shows your clusters, gives screenshots of common pages in the clusters (assuming you have screenshots enabled on Neha's crawler), and lets you look at their DOMs pretty easily. You have to compile and run it from the main folder, not from within web_clustering, because it is part of the web_clustering Java package.
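As a rough illustration of the "data" view described in the workflow above, the sketch below builds a view that exposes only a page id and a hash of the DOM. Everything named here is an assumption made for the example: the `pages` table, its columns, the choice of MD5 as the hash, and the in-memory SQLite database standing in for the real MySQL database. The actual setup script may expect different names or a different hash.

```python
import hashlib
import sqlite3

# In-memory SQLite is only a stand-in for the real MySQL database.
conn = sqlite3.connect(":memory:")

# Hypothetical table of crawled pages; table and column names are assumptions.
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, dom TEXT, dom_hash TEXT)")

sample_doms = {
    1: "<html><body>a</body></html>",
    2: "<html><body>b</body></html>",
    3: "<html><body>a</body></html>",  # same DOM as page 1
}
for page_id, dom in sample_doms.items():
    # MD5 is an assumed choice; the posting only says "a hash of the DOM".
    digest = hashlib.md5(dom.encode("utf-8")).hexdigest()
    conn.execute("INSERT INTO pages VALUES (?, ?, ?)", (page_id, dom, digest))

# The view exposes just the two columns the clustering step is said to expect.
conn.execute("CREATE VIEW data AS SELECT id AS page_id, dom_hash FROM pages")

rows = conn.execute("SELECT page_id, dom_hash FROM data ORDER BY page_id").fetchall()
for page_id, dom_hash in rows:
    print(page_id, dom_hash)
```

In MySQL the same idea would be a `CREATE VIEW data AS SELECT ...` statement over whatever table actually holds the page data.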
Unfortunately, you will need to dig into the Java file to change things like table names, the output location, and the location of your screenshot and DOM files. These are all hard-coded and spread across multiple files, so this part will be a little time consuming. Run it with the "-M" flag and delete any code that does not follow this execution path (there is a lot of it; he added lots of different options to this code as time went on). Then you will probably need to modify the SQL queries to grab the page data correctly, though I am not certain how much work this will be.

If you can get this to compile and run, you should be left with an output directory that contains a bunch of folders and files, one of which is "[login to view URL]". Opening this file in a web browser will give you a main page that shows your 225 most common clusters and the most common screenshots of the pages in those clusters, and has links to more information about the clusters, DOMs, etc.

Let me know if you have any questions about any of this; I would be happy to answer them. Good luck! Happy Bidding!
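For anyone who only needs the cluster sizes rather than the generated web site, the (rep_id, page_id) pairs described above can be ranked with a single grouped query. This is a minimal sketch under stated assumptions: the table name `clusters` is hypothetical (the real name is hidden in the posting), the sample rows are made up, and in-memory SQLite stands in for MySQL. It is not the query the Java code uses.

```python
import sqlite3

# In-memory SQLite is only a stand-in for the real MySQL database.
conn = sqlite3.connect(":memory:")

# Hypothetical table name; the posting only describes its (rep_id, page_id) rows.
conn.execute("CREATE TABLE clusters (rep_id INTEGER, page_id INTEGER)")

# Made-up sample data: rep_id is the lowest page id in each cluster.
pairs = [(1, 1), (1, 4), (1, 7), (2, 2), (2, 5), (3, 3)]
conn.executemany("INSERT INTO clusters VALUES (?, ?)", pairs)


def top_clusters(connection, limit):
    """Return (rep_id, size) for the largest clusters, biggest first."""
    query = """
        SELECT rep_id, COUNT(*) AS size
        FROM clusters
        GROUP BY rep_id
        ORDER BY size DESC, rep_id ASC
        LIMIT ?
    """
    return connection.execute(query, (limit,)).fetchall()


top = top_clusters(conn, 2)
print(top)
```

Calling `top_clusters(conn, 225)` would give the same kind of ranking as the main page described above, limited to the 225 largest clusters.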
Project ID: 3996385

About the project

4 proposals
Remote project
Active 11 years ago

4 freelancers are bidding on this job, at an average of $208 USD
I am a Java expert and I want to help you here. Please check your personal inbox for more details. I will wait for your reply. Thanks, AMit
$250 USD in 7 days
4.7 (100 reviews)
6.3
Hello sir. I have read all your requirements, and I am good at all of that. Please check the attached doc for my previous work. Hope to hear from you soon. Thanks!!
$200 USD in 10 days
5.0 (41 reviews)
4.8
Hi, I am confident I can handle this and will work until you are satisfied. I am keenly interested in this project. Thanks, with regards.
$195 USD in 4 days
5.0 (16 reviews)
4.0
Petra is a developer group with 5 years of experience in web development, desktop programming, and database design and programming. We have excellent expertise in web development languages and tools (PHP, Joomla, Drupal, Magento, HTML, CSS, AJAX, JavaScript, SEO, WordPress, etc.), programming languages (Java, C#), and database design (Oracle SQL, MySQL, MS SQL Server).
$185 USD in 7 days
0.0 (1 review)
1.6

About the client

Alexandria, United States
5.0
16
Payment method verified
Member since: June 30, 2012
