Hi,
I am new to WebCopy. This is a great tool for me.
Finally I found a tool that works for me. I have tried several tools (wget, HTTrack and dumpgenerator) and could not get them to work (maybe I am too stupid). After investing hours of searching, trying and investigating, I found WebCopy, which really works, even when a login to the website is required.
I want to make a copy of my wikis on Miraheze because Miraheze seems to be ceasing service; at least there is a discussion about it. I want to have at least a working copy on my hard drive.
Many rules are necessary to prevent unwanted downloads from a MediaWiki website, so I created a bunch of rules for that (a few examples of the kind of patterns I mean are below).
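For illustration only, these are the sort of exclusion rules I mean; the exact expressions here are examples, not my actual rule set. As far as I understand, each rule is a regular expression matched against the URL, with the rule set to exclude:

    Special:
    action=(edit|history|raw)
    [?&]oldid=
    [?&]printable=yes

These would skip the special pages, the edit/history/raw views, old revisions, and the printable duplicates of every article.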
The problem is a minor one, but it seems to waste time. It seems to me that WebCopy is trying to download a page even though it is already excluded by a rule and there is no need for further processing.
I have made a screenshot of an ongoing download that I did not want to stop because it has already been running for hours.
Thank you for your support,
--Manuela
BTW, creating the account was a challenge. For visually impaired people it is almost impossible to solve the riddle; I had to zoom in and request a new one several times.
You can test example URLs with the Rule Checker; it is very handy to see why a rule is not working.
Documentation: https://docs.cyotek.com/cyowcopy/current/rulechecker.html
Hello,
Thanks for the message. There was a bug introduced in a recent rework of WebCopy's rule processing: if you had rules that checked content types or minimum or maximum file sizes, WebCopy would incorrectly issue a HEAD request to get the content type and length. This was resolved in the later 1.9.1 builds, so if you have a current 1.9.1 build (870 at the time of writing) this issue should be resolved.
I'm not 100% sure if this is related to the issue you're having, but it is (or was!) the only occurrence I'm aware of where excluded URLs would be accessed.
Regards,
Richard Moss
Thank you, I will give it a try.