Recent posts

#11
WebCopy / How to disable my website from...
Last post by fiverr - September 29, 2024, 02:49:36 PM
In this post: https://forums.cyotek.com/cyotek-webcopy/is-it-possible-to-restrict-a-site-from-webcopy/ Richard talked about using the Cyotek user agent.

Where and how do I enter the user agent on my website to block WebCopy?
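
For illustration only, here is a minimal sketch of what blocking by user agent can look like on the server side, written as a small Python WSGI wrapper. The token "CyotekWebCopy" is an assumption and is not quoted anywhere in this thread, so check your own access logs for the value the crawler really sends. The names is_blocked and block_user_agents are hypothetical helpers. The User-Agent header is supplied by the client, so a check like this only stops clients that identify themselves honestly.

```python
from typing import Callable, Iterable

# Assumption: the crawler's User-Agent header contains this token.
# Verify the real value in your server's access logs before relying on it.
BLOCKED_TOKENS = ("CyotekWebCopy",)


def is_blocked(user_agent: str, tokens: Iterable[str] = BLOCKED_TOKENS) -> bool:
    """Return True when the User-Agent contains a blocked token (case-insensitive)."""
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in tokens)


def block_user_agents(app: Callable) -> Callable:
    """Wrap a WSGI application and answer 403 to blocked user agents."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            body = b"Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else is passed through to the wrapped application.
        return app(environ, start_response)
    return middleware
```

The same idea can usually be expressed in the web server's own configuration instead of application code; which one fits depends on how the site is hosted.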
#12
WebCopy / I believe content is downloade...
Last post by Shplad - September 03, 2024, 05:48:05 PM
Hi:

I'm fairly technical but have zero knowledge of web technology.

I've run copy operations on a website I want to download. Each run takes several hours, and the folders are full of files. However, when I double-click the first file in the folder, it shows me an unformatted page with the links from the site I wanted to copy. If I click any of those links, my browser simply goes to the actual site on the Internet instead of displaying local content.

What am I doing wrong?
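
For context, the behaviour described above is what a saved page looks like when its links still point at the original host. The sketch below is not WebCopy's code; it only illustrates the general idea of remapping a link to a relative local path. The function name remap_link, the example.com URLs in the usage note, and the choice of "index.html" for directory URLs are all assumptions made for the example.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def remap_link(href: str, page_url: str, site_host: str) -> str:
    """Turn a same-site link into a path relative to the saved page.

    Links to other hosts are returned unchanged.
    """
    absolute = urljoin(page_url, href)
    target = urlsplit(absolute)
    if target.netloc.lower() != site_host.lower():
        return href  # external link: leave it pointing at the Internet

    # Directory that contains the page the link appears on.
    page_dir = posixpath.dirname(urlsplit(page_url).path) or "/"

    target_path = target.path or "/"
    if target_path.endswith("/"):
        # Assumption: directory URLs are stored locally as index.html.
        target_path += "index.html"

    return posixpath.relpath(target_path, start=page_dir)
```

With these assumptions, remap_link("https://example.com/docs/a.html", "https://example.com/index.html", "example.com") gives "docs/a.html", which a browser opens from disk, while a link to another host is left as it was.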
#13
WebCopy / Re: Why some png images are re...
Last post by Richard Moss - August 25, 2024, 01:43:56 PM
How curious. WebCopy does download to temporary files and then renames when downloaded, but they should be in the appropriate destination folder not temp.

Firstly, what version of WebCopy are you using?

Secondly, did you save the project, and if so, are you able to share it for me to investigate? You can send the file via PM if you do not wish to make it public.

Alternatively, if you're able to reproduce with a distilled sample, please provide details.

Regards;
Richard Moss
#14
WebCopy / Why some png images are rename...
Last post by stetho7070 - August 21, 2024, 11:02:33 AM
Why are some images, e.g. https://img.7zap.com/images/oem/vag3/groups/3.png, downloaded and saved but renamed and stored in the TEMP folder?

https://img.7zap.com/images/oem/vag3/groups/3.png -> C:\Users\USER_PROFILE\AppData\Local\Temp\br3clmdl.hsh

#15
WebCopy / Re: Need to control timeout fo...
Last post by Richard Moss - August 18, 2024, 08:00:09 AM
Hello,

The default timeout is 10 seconds but WebCopy currently doesn't expose this as a configuration setting. I've added issue #516 to expose the setting and it will be in the next nightly update.

Retrying failed downloads is currently slated for 1.11 (#206). That was intended to be an automatic feature, but I've added a note to the issue regarding the possibility of manually triggering a retry.
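
As a general illustration of the two ideas discussed here (a configurable timeout and retrying failed downloads), the following is a minimal Python sketch using only the standard library. It is not WebCopy's implementation and says nothing about how issues #516 or #206 will work. The function names, the default of three attempts, and the linear back-off are assumptions chosen for the example; only the 10 second default comes from the post above.

```python
import socket
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch(url: str, timeout: float) -> bytes:
    """Download a URL, giving up if the server does not answer within `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_with_retry(
    url: str,
    timeout: float = 10.0,      # mirrors the 10 second default mentioned above
    attempts: int = 3,          # assumption: illustrative value
    backoff: float = 1.0,       # assumption: wait 1s, 2s, ... between attempts
    fetcher: Callable[[str, float], bytes] = fetch,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Try a download several times before giving up on timeouts or URL errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetcher(url, timeout)
        except (socket.timeout, TimeoutError, urllib.error.URLError) as error:
            last_error = error
            if attempt < attempts:
                sleep(backoff * attempt)
    raise last_error
```

For a slow site, a caller would raise the timeout, for example download_with_retry(url, timeout=60.0), rather than rely on retries alone.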

PS: I don't have a huge amount of time for support requests unless they relate to genuine bugs or feasible feature enhancements. The software is free, so for the most part you're on your own.
#16
WebCopy / Re: Need to control timeout fo...
Last post by Awful_Genius - August 09, 2024, 12:42:16 PM
Hi!
Any reaction to my post? Any updates? Is this forum for serious discussion or just for throwing dust in the eyes?

Update from my side: I tested with the latest nightly build (cyowcopy-1.10.0.898-x64-nightly) and the problem still exists. For one slow test site, ALL files targeted for crawling (JPG pictures) fail with "The operation has timed out" and are not downloaded.

Developers, do you have plans to fix it?

Thanks
#17
WebCopy / Asking for a real case explana...
Last post by jlomo - July 14, 2024, 06:27:17 PM
Hello,
What is the step-by-step procedure if I want to download only the pages linked to https://www.lacoccinelle.net/211213.html from the site https://www.lacoccinelle.net?
Thank you so much.
#18
WebCopy / Failed to load image resource ...
Last post by Steve_Gibson - July 03, 2024, 07:35:58 PM
I have long been a huge fan and promoter of WebCopy. I have a long-running (19 years) popular podcast ("Security Now!") where I've several times promoted and strongly recommended WebCopy. It has never given me any trouble... until today.

Today, upon startup, the opening splash screen came up and stalled. Finally, after quite a pause, the following series of five "Failed to load image resource" errors were presented in succession:


Since I hadn't run WebCopy for a while I figured I'd download the latest and greatest. This jumped me from a 2021 release to 2023. But... no change.

The only thing funky about my world may be that this machine I'm using is still Win7... but WebCopy states that it's still compatible with Win7, and the last time I used it I had no problems.

Note that I searched high and low on this site for any way to create a "ticket" or just to let Richard know about this little glitch. Also... after dismissing the five error messages, WebCopy ran and performed flawlessly.

So... just a heads-up, Richard.  Thanks for the terrific world-class solution.  I've sent money a few times, including again today, since I would LOVE to have you keep this alive, though it's already excellent!
#19
WebCopy / archive.org and "above root" e...
Last post by ben - June 26, 2024, 01:19:14 PM
I am trying to copy a website from archive.org, the format is something like
https://web.archive.org/web/20180914202352/http://www.motorboards.org/index.php/Main_Page

So I say "Download Everything" and in rules
I download everything that matches "motorboards.org"
and I exclude everything that doesn't match "motorboards.org"

However the vast majority of pages that I want are ignored because of "above root" error.
Why does "above root" even appear, if I say to download everything?
Thank you
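
One possible reading, not confirmed anywhere in this thread: if a crawler treats the folder of the start URL as the root, then archive.org pages stored under a different snapshot timestamp sit outside that folder, even though they belong to the same original site. The sketch below only demonstrates that kind of path-prefix comparison. It is not WebCopy's actual logic, root_prefix and is_under_root are hypothetical helpers, and the second and third URLs in the usage note are invented for illustration.

```python
import posixpath
from typing import Tuple
from urllib.parse import urlsplit


def root_prefix(start_url: str) -> Tuple[str, str]:
    """Return (host, folder) of the start URL; the folder always ends with '/'."""
    parts = urlsplit(start_url)
    directory = posixpath.dirname(parts.path).rstrip("/") + "/"
    return parts.netloc.lower(), directory


def is_under_root(url: str, start_url: str) -> bool:
    """True when `url` is on the same host and inside the start URL's folder."""
    host, directory = root_prefix(start_url)
    parts = urlsplit(url)
    return parts.netloc.lower() == host and parts.path.startswith(directory)
```

Taking the start URL from the post, is_under_root("https://web.archive.org/web/20180914202352/http://www.motorboards.org/index.php/Other_Page", start) is true, while the same page under a different snapshot such as ".../web/20170101000000/http://www.motorboards.org/index.php/Other_Page" is false under this model, because the timestamp is part of the path.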
#20
WebCopy / What is meant by "Site Only"? ...
Last post by therogoc - June 11, 2024, 10:45:05 PM
This webpage has PDF and LaTeX files linked:

https://borisbabic.com/teaching/inseadqm/home/index.html

I would like to download the page with the linked files for my own use.

1. Is the "Site Only" crawl mode with "Download all resources" really the correct approach? The documentation is confusing in regard to domain and URL.

2. The linked LaTeX files are incorrectly saved with an HTML file. How can I download them correctly?