Recent posts

#1
WebCopy / What is meant by "Site Only"? ...
Last post by therogoc - June 11, 2024, 10:45:05 PM
This webpage has PDF and LaTeX files linked:

https://borisbabic.com/teaching/inseadqm/home/index.html

I would like to download the page, together with the linked files, for my own use.

1. Is the "Site Only" crawl mode with "Download all resources" really the correct approach? The documentation is confusing in regard to domain and URL.

2. The linked LaTeX files are incorrectly saved as HTML files. How can I download them correctly?
#2
WebCopy / Using 'input-file'
Last post by Jake Morrison - June 11, 2024, 09:19:27 PM
Hi,
Both the CLI & GUI command lines allow '/input-file <file>' to be used.
I'm wondering if there is a description available of the file format or any other help with using 'input-file'.

Also, I can't find a way to use an input file from within the program. Have I missed it somehow?

thanks,
Jake
#3
WebCopy / Re: How to crawl remote sitema...
Last post by lychniscuddly - June 10, 2024, 03:28:15 AM
Hi, I'm a new member.
Do you have any updates? I'm experiencing the same issue as mentioned earlier, on the same version.
#4
WebCopy / Need to control timeout for we...
Last post by Awful_Genius - June 08, 2024, 08:14:29 AM
Hi all!

I've been using WebCopy for a long time and often get the error "The operation has timed out" for some files while crawling.

After some research, I think WebCopy most likely uses something like the .NET WebRequest class with a short timeout:

System.Net.WebException: The operation has timed out

Sometimes the exception is raised for large files that the web server can't send quickly.

So, could you add a way to control request timeouts in WebCopy?

Another option would be a re-download option on the Errors tab.
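
The retry idea requested above can be sketched outside WebCopy. This is only an illustration of the behaviour, not WebCopy's code: `fetch_with_retries` and the timeout values are hypothetical, and the sketch simply retries a failed URL with a progressively longer timeout.

```python
import socket
import urllib.error
import urllib.request


def fetch(url, timeout):
    """Download one URL, giving up after `timeout` seconds of inactivity."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retries(url, timeouts=(30, 120, 600), fetcher=fetch):
    """Try the URL once per timeout value, moving to a longer timeout after each failure.

    The timeout values are illustrative assumptions, not WebCopy defaults.
    """
    if not timeouts:
        raise ValueError("at least one timeout value is required")
    last_error = None
    for timeout in timeouts:
        try:
            return fetcher(url, timeout)
        except (socket.timeout, TimeoutError, urllib.error.URLError) as error:
            last_error = error  # remember the failure and try a longer timeout
    raise last_error
```

Passing the list of URLs that failed during a crawl through `fetch_with_retries` would give a rough equivalent of a re-download pass.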

#5
WebCopy / How to crawl remote sitemap.xm...
Last post by chillybang - June 03, 2024, 05:42:33 PM
Hey!

I must say, I'm very impressed after switching from HTTrack to Cyotek - it is like a millennium switch :)

As a newbie, I'm missing a feature, or rather a way to find it: crawling URLs from a remote, live sitemap.

I tried to add a sitemap as https://example.com/sitemap.xml to "Load additional URLs from file" - but this doesn't work (it raises an alert on saving).

Am I missing something? What is the way to crawl all URLs from a remote live sitemap? Saving the site's sitemap locally to use it as a URL list isn't very useful because sites update their sitemaps...
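
One possible workaround is to refresh a local URL list from the live sitemap just before each crawl. The sketch below is a hedged illustration, not a WebCopy feature: the function names are hypothetical, and it assumes "Load additional URLs from file" accepts a plain text file with one URL per line, which should be checked against the WebCopy documentation.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_urls(sitemap_xml):
    """Return the text of every <loc> element in a sitemap document (bytes)."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def save_url_list(sitemap_url, output_path):
    """Download a sitemap and write its URLs to a text file, one per line."""
    with urllib.request.urlopen(sitemap_url, timeout=30) as response:
        urls = extract_urls(response.read())
    with open(output_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(urls) + "\n")
    return len(urls)
```

Calling `save_url_list("https://example.com/sitemap.xml", "urls.txt")` before starting a crawl would keep the list current. For a sitemap index file, the extracted entries are themselves sitemap URLs and would need a second pass.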
#6
WebCopy / Local File Links issue + some ...
Last post by DavesNotHereDude - June 02, 2024, 04:07:54 PM
I'm having a hard time getting URLs to link properly on a site where the pages are hosted in multiple subdirectories.

Example.

Although the root directory is https://guides.goingviralnewsletter.hookpoint.com/

The page I want to download started at https://guides.goingviralnewsletter.hookpoint.com/0c6dd8e9/school-of-hard-knockz/

I'm able to download the site, and when I open the index.htm below, I'm able to load the page (some images don't load, more on that later).

C:\Downloaded Web Sites\guides.goingviralnewsletter.hookpoint.com\0c6dd8e9\school-of-hard-knockz\index.htm

When I click a link on the loaded page, it redirects to file:///C:/0c6dd8e9/school-of-hard-knockz/LINK

Instead of the full URL.

Note, this is only when I disable "Remap references within downloaded file". When this is enabled, the index.htm on https://guides.goingviralnewsletter.hookpoint.com/0c6dd8e9/school-of-hard-knockz/ also loads "Your file couldn't be accessed" because it references file:///C:/0c6dd8e9/school-of-hard-knockz/

The second issue is that some of the content doesn't load.

A couple of examples:

If I download https://guides.goingviralnewsletter.hookpoint.com/0c6dd8e9/school-of-hard-knockz/ the banner image doesn't load.
If I download https://guides.goingviralnewsletter.hookpoint.com/0c6dd8e9/school-of-hard-knockz/video-references/ there are links to videos that don't render (not sure if that's fixable?)

The secondary issue is moot if I can't get the primary issue resolved.

I've searched through the forum and I've seen a couple of suggestions (I thought Remap references within downloaded file was the fix for this issue, but oddly, when it's turned on, it seems to break the main index file within the root directory I added to start the crawl from).
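
The file:///C:/0c6dd8e9/... behaviour in the first issue is consistent with links written as root-relative paths (starting with "/"), which a browser resolves against the drive root when the page is opened from disk. That is an assumption about the page's markup, not something the post confirms. Under that assumption, a minimal sketch of rewriting such a link relative to the page's own folder looks like this. `to_relative_href` is a hypothetical helper, not part of WebCopy.

```python
import posixpath


def to_relative_href(page_path, href):
    """Rewrite a root-relative href so it is relative to the page's folder.

    page_path: path of the page within the site, e.g. "/a/b/index.htm"
    href: the link target as written in the page
    """
    if not href.startswith("/") or href.startswith("//"):
        return href  # relative, absolute, or protocol-relative: leave untouched
    page_dir = posixpath.dirname(page_path)
    relative = posixpath.relpath(href, start=page_dir)
    # relpath drops a trailing slash; restore it so folder links stay folder links
    if href.endswith("/") and not relative.endswith("/"):
        relative += "/"
    return relative
```

With the page at /0c6dd8e9/school-of-hard-knockz/index.htm, a link to /0c6dd8e9/school-of-hard-knockz/LINK becomes LINK, which resolves inside the download folder instead of at the drive root.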
#7
WebCopy / Re: How to auto skip warning o...
Last post by haultalented - May 30, 2024, 10:31:43 AM
Hi, I'm a new member.
Are there any updates? I have the same problem as mentioned, on the same version.
 
#8
ImageBox / Scroll Behavior Adjustment Aft...
Last post by Zeeshanef - May 30, 2024, 10:23:32 AM
I am creating a rectangle overlay on an image by a Button click using the code below. My query is, after painting the rectangle, what's the correct method to automatically scroll to the image so that the painted rectangle is visible in the center of the available container?

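private void ImageBox1_Paint(object sender, PaintEventArgs e)
{
    if (drawRectangle)
    {
        // Semi-transparent fill (alpha 64 of 255) so the image stays visible
        using (Brush brush = new SolidBrush(Color.FromArgb(64, 1, 165, 220)))
        {
            // Translate the image-space rectangle into the control's current zoom and scroll position
            Rectangle OffSetRect = imageBox1.GetOffsetRectangle(RectDraw);
            e.Graphics.FillRectangle(brush, OffSetRect);
        }
    }
}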
#9
WebCopy / Special Character in Web Addre...
Last post by blaidd31204 - May 28, 2024, 04:30:10 PM
I am trying to copy the following website, and I believe the special U character (the one with a tent-like symbol over it, like the one above the 6 key) is causing a problem. I have the most current version of WebCopy. The Yield message that appears does not have any words to indicate the exact problem (I changed my screen resolution to see whether that would help me see any message wording, but I still can't tell).

https://forgottenrealms.fandom.com/wiki/Faer%C3%BBn

How do I get this website?  Thanks!
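
For reference, the %C3%BB in that address is the UTF-8 percent-encoding of û (U+00FB), so the URL itself is validly encoded. The short check below only demonstrates the encoding. It does not diagnose what WebCopy does with the URL.

```python
from urllib.parse import quote, unquote

encoded = "Faer%C3%BBn"

# unquote decodes percent-escapes as UTF-8 by default
decoded = unquote(encoded)

# quote re-encodes non-ASCII characters as UTF-8 percent-escapes
round_trip = quote(decoded)

print(decoded)
print(round_trip == encoded)
```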
#10
WebCopy / Re: Multiple download threads
Last post by Yukislk - May 28, 2024, 09:41:15 AM

The more in-depth explanation is that this is an old logged issue that users have requested for a long time.