Limit download to only first page?

Started by End User, September 16, 2016, 12:27:53 AM

End User

Hello!

I would like to scan several news sites regularly.
But I don't want to download entire sites.

How can I limit my project to download only the first page of each site?
(I have a list of URLs)

I looked through the documentation, but this question is not covered there.

Thank you!

Richard Moss

Hello,

Thanks for the question and welcome to the forums. And sorry for the delay in responding, I'm not allocating much time for support requests at the moment!

Unfortunately, I have to concede a bit of defeat here. It's not a use case I originally considered, and the software makes it remarkably difficult to achieve. I spent a bit of time trying, but the software is stubborn in this regard: if you add a rule to exclude everything, any rule you add to include something is ignored. Unhelpful :(

You could add rules to exclude all top-level pages, but that is time-consuming and isn't future-proof if the site's structure changes. There's a hacked-in level flag which is used by Quick Scan, but as it was just a hack I never exposed it via the front end for normal use.

If you *only* want to download a single page and no other resources at all, you could use the command line version easily enough, using something similar to the below:

wcopy http://www.cyotek.com /o download.html

You could then stick multiple calls to this in a batch file to download the URL list you mentioned. However, this doesn't perform any remapping - it's a direct download only.
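As a rough illustration of the batch approach, here is a small Python sketch that builds one wcopy call per URL from a list. It assumes wcopy is on the PATH and accepts exactly the "URL /o file" form shown above. The urls.txt file name, the output naming scheme, and the function names are made up for this example and are not part of WebCopy.

```python
import subprocess
from urllib.parse import urlparse


def output_name(url, index):
    """Build a file name such as '01-www.cyotek.com.html' from a URL.

    The numeric prefix is a hypothetical convention that keeps two URLs
    on the same host from overwriting each other.
    """
    host = urlparse(url).netloc or "page"
    # Replace characters that are awkward in file names, e.g. ':' in host:port
    safe = "".join(c if c.isalnum() or c in ".-" else "_" for c in host)
    return f"{index:02d}-{safe}.html"


def build_commands(urls):
    """Return one argument list per URL, mirroring 'wcopy <url> /o <file>'."""
    return [
        ["wcopy", url, "/o", output_name(url, index)]
        for index, url in enumerate(urls, start=1)
    ]


def download_all(list_path="urls.txt"):
    """Read one URL per line from list_path and run wcopy for each."""
    with open(list_path, encoding="utf-8") as handle:
        urls = [line.strip() for line in handle if line.strip()]
    for command in build_commands(urls):
        # check=False: a non-zero exit code for one site does not stop the rest
        subprocess.run(command, check=False)


# Usage (assumes urls.txt exists and wcopy is installed):
# download_all("urls.txt")
```

Building the commands separately from running them makes it easy to print and inspect them first. If wcopy cannot be found, subprocess.run raises FileNotFoundError.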

I will make sure to include this scenario when I finally get WebCopy's rule system overhauled, so thank you for the suggestion. I hope the above information helps, if not please let me know.

Regards;
Richard Moss
Read "Before You Post" before posting (https://forums.cyotek.com/cyotek-webcopy/before-you-post/). Do not send me private messages. Do not expect instant replies.

All responses are hand crafted. No AI involved. Possibly no I either.

End User

G'day!

It's good to hear that the product is growing in functionality!

I'll stick around to be informed ::)

Meanwhile, gonna try to get wget for Windows :D

Thank you!

Richard Moss

Hello

Thanks for the follow-up. I'm sorry that WebCopy wasn't suitable for your needs, but thanks for the feedback, which will hopefully improve the product for future users.

Regards;
Richard Moss