I need someone to produce a script to clone a website (or to do the cloning manually) for two of my sites, which I wish to archive. A script would be preferable, as I have a recurring need to do this yearly. The work must be carried out without access to the source server, since only the public content needs to be part of the archived copy. The sites currently run on a CMS, and because they will no longer be updated I want to archive them as static HTML; this saves the long-term maintenance of keeping the CMS up to date.
I've attempted this myself using wget, but the results were not satisfactory: URLs were rewritten, and some paths were missed entirely. The copy needs to be exact, as the static copies will replace the running sites once completed, and I don't want to lose SEO credit for the existing URLs.
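For reference, this is roughly the kind of invocation I'd expect a solution to start from (a sketch only; `example.com` is a placeholder for the real domain, and the exact option set is part of what I'm asking a bidder to get right):

```shell
#!/bin/sh
# Hypothetical mirroring sketch -- example.com is a placeholder.
# --mirror           : recursive download with timestamping, suitable for re-runs
# --page-requisites  : also fetch the CSS, JS and images each page needs
# --no-parent        : never ascend above the starting path
# --adjust-extension : save HTML responses with a .html suffix on disk
# Note: --convert-links is deliberately omitted, since it rewrites URLs
# inside the saved pages; the original paths should be preserved and the
# web server configured to serve them unchanged.
wget --mirror --page-requisites --no-parent --adjust-extension \
     --restrict-file-names=unix \
     https://example.com/
```

A yearly re-run of the same script over a fresh output directory would satisfy the recurring requirement.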
The sites will be hosted using nginx, and I will have full control over the configuration, so if you need to tweak the settings to ensure the URLs remain the same, that is acceptable.
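As an illustration of the kind of tweak I mean, a `try_files` rule could map the original extensionless CMS paths onto archived `.html` files, so no URLs change. This is only a sketch under assumed paths (`example.com` and the archive directory are placeholders):

```nginx
server {
    listen 80;
    server_name example.com;            # placeholder domain
    root /var/www/example.com/archive;  # placeholder archive path

    location / {
        # Serve /about from /about.html or /about/index.html, so the
        # original CMS URLs keep resolving without redirects.
        try_files $uri $uri.html $uri/index.html =404;
    }
}
```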
Suggested solutions that use server-side includes or dynamic templating to remove duplicated HTML would be welcome, in case I want to make small tweaks in the future.
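For example, nginx's built-in SSI module could stitch shared fragments (header, footer) into the archived pages, so a future tweak is made in one file. A minimal sketch, assuming a fragment saved at a hypothetical path `/includes/footer.html`:

```nginx
location / {
    ssi on;  # enable Server Side Includes processing for served files
    try_files $uri $uri.html $uri/index.html =404;
}
```

Each archived page would then reference the shared fragment with an SSI comment such as `<!--# include virtual="/includes/footer.html" -->`.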
For the avoidance of doubt (and to address the warning below) I fully own and control the two sites to be copied and can provide proof of this to any bidder/freelancer admin.
URLs will be provided to the successful bidder, or on request if you need them as part of your bid (e.g. if you plan to rebuild the site in some way).