Crawler to extract keywords and more, from a website

$50-500 USD

Completed
Posted: over 8 years ago

Paid on completion
Executive summary: We need a script to extract all the text as words from given websites, sort those keywords in ascending order, remove irrelevant ones such as 'and' and 'the', and associate each keyword with the sentences in which it is mentioned.

- We're a company developing chatbots used in customer service, answering their questions on their website or mobile apps. Think of "Siri for customer service".
- Customers, especially smaller ones, usually don't want to dedicate time to creating questions and answers for their chatbots, so we have a high barrier to entry in terms of setup. It's not easy to sell when we expect them to dedicate time before they can start using their chatbot.

We want to overcome this hurdle by developing an automated content crawler. "Content" is basically the most frequently used keywords and the important sentences that contain these keywords. (No need to create questions, as keywords and the sentences associated with them are the building blocks of a chatbot.) So we essentially need a snapshot of what the website is all about.

Phases are:
1- A crawler to crawl plain-text-based content from various websites submitted via a form (domain and subdomains).
2- A frequency analysis to list the most common keywords in ascending order. We need a list of stop words (irrelevant words such as 'the', 'a', etc.) so that we can strip them out before finalizing the keyword list.
3- List keywords and the sentences that contain those keywords. We might need to remove some entries from this list based on rules that make them irrelevant. We'll figure out those rules as we see some results. The remaining content will be the essence of our project.
4- Based on experience so far, we might develop a new phase to further fine-tune results. Our goal is to show the finalized content to the website owner and let them say "this exactly covers every bit of information about my business".
5- We'll feed those keywords and sentences (answers to those keywords) to a chatbot database.
6- Last but not least, we'd like to feed in a list of websites to execute the previous 5 steps automatically and create the same output for all of them.

If you can provide a really simple proof of concept, I could accept your bid the same moment. We need to see that the results generated from this project will help us develop chatbots and increase sales. Conversely, a copy-paste message will not get you the job. It demonstrates that you're a serial bidder, not someone who is willing to go the extra mile to solve a real problem for a business.
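To make phases 2 and 3 concrete, here is a minimal sketch in Python, assuming the crawler has already produced plain text. It is not the client's or any bidder's implementation. The names `STOP_WORDS`, `split_sentences`, `tokenize`, `keyword_index` and `report` are hypothetical, the stop-word list is deliberately tiny, and the sentence splitter is naive (it will mis-split abbreviations such as "e.g.").

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately short stop-word list; a real run needs a fuller one.
STOP_WORDS = {
    "a", "an", "and", "are", "for", "in", "is", "it", "of", "on",
    "our", "the", "to", "we", "with", "you", "your",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Return lowercase alphabetic words, allowing one inner apostrophe."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", sentence.lower())


def keyword_index(text, stop_words=STOP_WORDS):
    """Count non-stop-word keywords and map each one to the sentences containing it."""
    counts = Counter()
    sentences_by_keyword = defaultdict(list)
    for sentence in split_sentences(text):
        words = [w for w in tokenize(sentence) if w not in stop_words]
        counts.update(words)
        # Record each sentence once per keyword, even if the word repeats in it.
        for word in set(words):
            sentences_by_keyword[word].append(sentence)
    return counts, dict(sentences_by_keyword)


def report(text, top_n=5):
    """Return (keyword, frequency, sentences) tuples, most frequent first."""
    counts, sentences = keyword_index(text)
    return [(word, n, sentences[word]) for word, n in counts.most_common(top_n)]
```

Calling `report(page_text)` on the text of a crawled page yields the keyword list with its associated sentences. The sketch lists the most frequent keywords first; reversing the returned list gives ascending order if that is what the brief intends. The irrelevance rules mentioned in phase 3 are left out because the brief does not define them yet.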
Project ID: 8262662

About the project

14 proposals
Remote project
Active 9 years ago

Awarded to:
Hi, I'm not a serial bidder and your project is very interesting. Within my bid budget, I can build a proof of concept that covers steps 1-3 (I cannot quote for 4-5, as they depend heavily on 1-3), including a database structure to store keywords and sentences. Should we discuss the UI for steps 1-3?
$499 USD in 5 days
5.0 (345 reviews)
9.0
14 freelancers are bidding on this job, at an average of $495 USD
Hello there, I would like to help you out with this project. I have experience scraping websites, and since it is only plain text, that will be easy. Regarding the keywords, I have experience in data mining, which means I know how to handle the database in order to find the most used keywords and other important aspects, such as relations between keywords, which I think will be really important for you. Hope to hear back from you soon. Thanks
$416 USD in 3 days
5.0 (111 reviews)
8.8
Hi sir, I am a scraping expert and have done many similar projects; please check my feedback and you will see. Can you tell me more details? Then I will provide demo data for you. Thanks, Kimi
$565 USD in 6 days
5.0 (573 reviews)
8.4
Dear Sir, I'm delighted to let you know that I have done data scraping with PHP-cURL, Node.js, and Selenium on many sites. I scraped the data from the websites and then wrote it to a MySQL database or to an Excel, CSV, or XML file. I have worked on many similar projects and have extensive experience in data mining. I have written hundreds of web scrapers that scrape millions of pages each day. I'm ready to fulfill your requirement. I can finish this task in a short time, with the best quality. I can assure 100% accuracy. Please give me the opportunity to do the work. With Kind Regards, Debdulal Roy Proshanta
$444 USD in 3 days
4.9 (106 reviews)
7.7
I want to discuss this project with you further. Let me know the most suitable time to schedule a meeting. Feel free to message me at any time; I am usually online 14 hours a day on this website, so you will probably get a quick response from me.
$515 USD in 12 days
4.8 (54 reviews)
7.1
Hello sir, after reading your requirements, I can do your project using a semantics feature I have already built, which automatically handles keywords that occur with high frequency, along with a universal list of stop words and word breakers (different forms of a verb). My info: I have done scraping on almost half of the World Wide Web, including e-commerce giants (Amazon, eBay, Craigslist), news feeds, social media websites, and APIs. I develop my own tools based on client requirements, with multithreading, a bot with human behaviour, and scraping applications with document parsing. I can do PDF parsing and CAPTCHA bypass code as well. Contact me for further details or a demo.
$305 USD in 3 days
4.9 (53 reviews)
6.8
Hello Dear, I can do this for you. Please send a message in the PMB for details. Best Regards, flashsaiful
$500 USD in 6 days
4.8 (133 reviews)
6.7
Hello! I'm a web scraping expert. I use the Python Scrapy framework. My scripts can run on Windows or Linux, but Linux is preferable. I can schedule scripts on a server if required. I can scrape secured and protected sites; my crawlers can log in through login forms, emulate AJAX requests, etc. If a site blocks my IP, I can use a proxy or Tor. I can try to get past CAPTCHAs on a site in automatic or manual mode. I can export data to JSON, CSV (Excel), MySQL, or MongoDB. I have a lot of finished projects (Google scraping, Facebook scraping, Yellow Pages, web shops, and other sites with lists of items). I like your project. Message me if you have questions.
$449 USD in 3 days
4.8 (107 reviews)
6.5
Hi, I have extensive knowledge of web scraping with Ruby and 5+ years of experience building apps with Ruby on Rails. I can get this completed in 3 days. We can discuss the details via message. Thanks
$300 USD in 3 days
5.0 (9 reviews)
4.7

About the client

New York, United States
5.0
67
Payment method verified
Member since Mar 31, 2002

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)