Skip reason is Failed although excluded by rule - why?

Started by Manuela, June 22, 2023, 06:56:52 AM


Manuela

Hi,
I am new to WebCopy. This is a great tool for me.

Finally I found a tool that is working for me. I have tried several tools (wget, httrack and dumpgenerator) and could not get them to work (maybe I am too stupid). After investing hours of searching, trying and investigating, I found WebCopy, which really works even when a login to the website is required.

I want to make a copy of my wikis on Miraheze because Miraheze may cease service; at least there is a discussion about it. I want to have at least a working copy on my hard drive.

Many rules are necessary to prevent unwanted downloads from a MediaWiki website, so I created a bunch of rules for that.

The problem is a minor one, but it seems to waste time. It seems to me that WebCopy is trying to download the page although it is already excluded by a rule and needs no further processing.

I have made a screenshot of an ongoing download that I did not want to stop because it has already been running for hours.

Thank you for your support,
--Manuela

BTW, creating the account was a challenge. For visually impaired people it is almost impossible to solve the riddle; I had to zoom in and request a new one several times.

rumia

You can test example URLs with the rule checker; it is very handy for seeing why a rule is not working.

documentation
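
For a quick check outside WebCopy, the same idea can be sketched in a few lines of Python: try each exclusion pattern against a sample URL and report the first one that matches. The patterns and the example.miraheze.org URLs below are made up for illustration, they are not the rules from the original post, and Python's regex dialect is not identical to the one WebCopy uses, so treat the built-in rule checker as the authority.

```python
import re

# Hypothetical exclusion patterns for a MediaWiki site.
# These are illustrative assumptions, not the original poster's rules.
EXCLUDE_PATTERNS = [
    r"[?&]action=(edit|history|raw|info)\b",  # edit/history/raw/info views
    r"[?&]oldid=\d+",                         # old revisions
    r"[?&]diff=",                             # revision diffs
    r"/Special:",                             # special pages
]


def first_matching_rule(url, patterns=EXCLUDE_PATTERNS):
    """Return the first pattern that matches url, or None if none match."""
    for pattern in patterns:
        if re.search(pattern, url):
            return pattern
    return None


if __name__ == "__main__":
    # Made-up sample URLs; replace them with URLs from your own crawl.
    samples = [
        "https://example.miraheze.org/wiki/Main_Page",
        "https://example.miraheze.org/w/index.php?title=Main_Page&action=edit",
        "https://example.miraheze.org/wiki/Special:RecentChanges",
    ]
    for url in samples:
        rule = first_matching_rule(url)
        status = "excluded by " + rule if rule else "not excluded"
        print(url, "->", status)
```

If a URL you expected to be excluded comes back as "not excluded", the pattern itself is the first thing to look at.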

Richard Moss

Hello,

Thanks for the message. There was a bug introduced in a recent rework of WebCopy rule processing: if you had rules that checked content types or minimum or maximum file sizes, WebCopy would incorrectly issue a HEAD request to get the content type and length. This was fixed in the later 1.9.1 builds, so if you have a current 1.9.1 build (870 at time of writing) this issue should be resolved.
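
To illustrate the ordering involved, here is a minimal Python sketch. It is not WebCopy's actual code; the `Rule` class, the `decide` function and the injected `head` callable are all hypothetical names. The assumption it shows is that rules matching on the URL alone can be decided without any network access, so a HEAD request should only be needed when a content type rule is the one left to decide.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rule:
    """Hypothetical exclusion rule; not WebCopy's real data model."""
    url_pattern: str
    content_type: Optional[str] = None  # None means the rule looks at the URL only


def decide(url: str, rules: List[Rule], head: Callable[[str], str]) -> Tuple[bool, int]:
    """Return (excluded, number_of_head_requests_made) for url."""
    head_calls = 0

    # Pass 1: URL-only rules. No request is needed to evaluate these.
    for rule in rules:
        if rule.content_type is None and re.search(rule.url_pattern, url):
            return True, head_calls

    # Pass 2: content type rules. Fetch the type at most once, and only now.
    content_type = None
    for rule in rules:
        if rule.content_type is not None and re.search(rule.url_pattern, url):
            if content_type is None:
                content_type = head(url)
                head_calls += 1
            if content_type.startswith(rule.content_type):
                return True, head_calls

    return False, head_calls


if __name__ == "__main__":
    def fake_head(url: str) -> str:
        # Stand-in for a real HEAD request; always reports an HTML page.
        return "text/html"

    rules = [Rule(r"[?&]action=edit"), Rule(r".*", content_type="video/")]
    print(decide("https://example.org/w/index.php?title=A&action=edit", rules, fake_head))
    print(decide("https://example.org/wiki/A", rules, fake_head))
```

In this sketch the first URL is excluded without a HEAD request, while the second needs one because only the content type rule can decide it.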

I'm not 100% sure if this is related to the issue you're having, but it is (or was!) the only occurrence I'm aware of where excluded URLs would be accessed.

Regards;
Richard Moss
Read "Before You Post" before posting (https://forums.cyotek.com/cyotek-webcopy/before-you-post/). Do not send me private messages. Do not expect instant replies.

All responses are hand crafted. No AI involved. Possibly no I either.

Manuela