Few questions about copying a website

Started by matrix1999, August 08, 2019, 08:27:58 PM


matrix1999

Hi,

First, I just want to start with a big thank you for this app.  I have used a variety of tools and platforms and so far with mixed results - wget, httrack, you name it.

I have been running Cyotek for several days now and my project is still in progress but so far so good.  However, I have a few questions:

1/ File extensions - In my project I didn't explicitly include or exclude anything, and so far it is downloading everything (which is what I want).  However, I came across several files with .epub extensions, and the tool is saving them as .bin.  Is this normal?  Did I do something wrong?  If so, can someone please shed some light on this?

2/ Long path names and long file names - The site that I am trying to copy has many directories and subdirectories, and many of them have long names.  Just curious, how does WebCopy handle them?  In the past I used wget for Windows, and it stopped copying paths and/or files with very long names.  This could be a limitation of the Windows operating system, so I thought I'd ask.  That being said, would it be possible to overcome this if a Linux version is released?

Thank you once again for this fantastic tool.  Please keep up the great work.

Regards,
Neo

Richard Moss

Hello,

Welcome to the forums and thanks for the questions.

1. WebCopy will try and remap extensions to match official mime types. bin has the type application/octet-stream but WebCopy by default won't remap file extensions of that type. If you're able to share the URL of a page containing one of these epub files I can do some testing in case there is a bug. You can find the options relating to how WebCopy handles extensions in Project Properties | General | Local Files.
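To illustrate the rule described above, here is a minimal Python sketch. It is not WebCopy's actual code: the function name `local_filename` and the small `MIME_TO_EXT` table are made up for the example. It remaps a file's extension to match the server's reported MIME type, but leaves the name alone when the type is the generic application/octet-stream or is not in the table.

```python
import os

# Hypothetical lookup table for the example; a real tool would carry a far larger one.
MIME_TO_EXT = {
    "text/html": ".html",
    "application/epub+zip": ".epub",
    "application/pdf": ".pdf",
    "image/jpeg": ".jpg",
}

# Types that say nothing useful about the content, so no remapping is attempted.
GENERIC_TYPES = {"application/octet-stream"}


def local_filename(url_filename, content_type):
    """Return the name to save under, remapping the extension when the MIME type is known."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in GENERIC_TYPES:
        return url_filename  # keep whatever extension the URL had
    ext = MIME_TO_EXT.get(mime)
    if ext is None:
        return url_filename  # unknown type: leave the name untouched
    stem, _ = os.path.splitext(url_filename)
    return stem + ext


print(local_filename("book.epub", "application/octet-stream"))
print(local_filename("index.php", "text/html; charset=utf-8"))
```

Under this rule a file served as application/octet-stream keeps the extension from its URL, so an .epub link would stay .epub.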

2. WebCopy will try to keep filenames matching those of the website, but if the total length of the path and file name is too long, it will automatically try to shorten it as best it can, mostly by trimming the file name, but in extreme cases it might drop back a level or two in the path as well.
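The shortening strategy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not WebCopy's implementation: `shorten_path` is a hypothetical helper, and the 260-character budget is the classic Windows MAX_PATH value used here as an assumed limit. It trims the file name first and only drops the deepest directory levels when trimming alone cannot fit.

```python
import os

MAX_PATH = 260  # classic Windows MAX_PATH value, used here as an assumed budget


def shorten_path(root, dirs, filename, limit=MAX_PATH, sep="\\"):
    """Fit root + dirs + filename into `limit` characters.

    The file name stem is trimmed first; if even a one-character stem
    does not fit, the deepest directory is dropped and we try again.
    """
    dirs = list(dirs)  # work on a copy so the caller's list is not modified
    stem, ext = os.path.splitext(filename)
    while True:
        prefix = sep.join([root] + dirs) + sep
        room = limit - len(prefix) - len(ext)  # characters left for the stem
        if room >= 1:
            return prefix + stem[:room] + ext
        if not dirs:
            raise ValueError("root folder alone exceeds the limit")
        dirs.pop()  # drop back one level in the path


short = shorten_path("C:\\copy", ["aaaa", "bbbb"], "longfilename.html", limit=20)
print(short, len(short))
```

A real tool would also have to deal with name collisions after trimming (two long names that shorten to the same result), which this sketch ignores.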

Hope this helps, let me know if not.

Regards;
Richard Moss
Read "Before You Post" before posting (https://forums.cyotek.com/cyotek-webcopy/before-you-post/). Do not send me private messages. Do not expect instant replies.

All responses are hand crafted. No AI involved. Possibly no I either.


Richard Moss

Hello,

That's odd - I just ran a quick test on my desktop using out-of-the-box settings and it looks like the files are downloading fine.

Two quick questions:

* What version of WebCopy are you using?
* Do you have an application installed that will open epub books?


Regards;
Richard Moss

matrix1999

I am currently using version 1.7.0.600.  Should I try the 1.8 nightly builds instead?

As for the app, I don't currently have one that opens .epub files (not on the machine that I run WebCopy on, anyway).


Richard Moss

Hello,

Thanks for the follow-up. 1.8 probably won't help; I'll try to set aside some time for a debugging session today. The fact that it works for me and not for you suggests that it's not looking up the mime type correctly and is using a machine-dependent fallback. If this is the case, it should be easy enough to correct.

Regards;
Richard Moss

Richard Moss

Hello,

Unfortunately I'm still not able to reproduce this, even on a clean Windows 7 VM. Assuming you saved your project with its results, are you able to share this so I can see what settings are defined and URI data stored?

Regards;
Richard Moss

matrix1999

I had to restart my project several times, as my machine crapped out repeatedly during the run (nothing to do with WebCopy; I think my computer needs better ventilation, but that's another story altogether).  In any case, let me finish the project and I'll upload the project file and log all at once.

Cheers!