Images and website CSS broken after downloading

Started by projectteam, February 18, 2016, 01:20:23 PM


projectteam

Hello,
I am trying to copy a website. When I add the URL and select "Copy website" and then open the index.html output file, I see the images.

eg: C:/cyotek/demo.cwp

While copying, the src attributes of the images point to the original website, e.g. http://www.mywebsite.com/data/images/15533.jpg, rather than to the local folder that is created.

When the copy is finished and I refresh the local index file, the local website is broken: the src attributes now point to the local images (which are not displayed) and the CSS is not rendered. Only the links are shown.

Could there be a security issue that prevents the rendering and breaks the CSS?
I am attaching two images, the original and the local file (it's in Greek).

Original: http://send-picture.com/img-56c5c4c1bb4a1.html
Local: http://send-picture.com/img-56c5c4f0d5e9e.html


Thank you
-Konstantinos

Richard Moss

Hello,

Thanks for the message and I'm sorry you're having troubles with WebCopy.

If you open up the "local index file" in a text editor and look at the <link> and <img> tags, what do the source paths look like? Do they seem to be valid or are they pointing to some missing location? Is the extension for the index file .html (or a variant thereof) or something different like .php?
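For anyone who would rather script that check than read the markup by hand, here is a minimal sketch in Python. It is not part of WebCopy; the names `ResourceCollector` and `classify_resources` are made up for illustration, and it assumes the paths in the page are relative to the folder holding the index file. It lists every <link> href and <img> src and reports whether each one is remote, found on disk, or missing.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse, unquote


class ResourceCollector(HTMLParser):
    """Collects href values of <link> tags and src values of <img> tags."""

    def __init__(self):
        super().__init__()
        self.paths = []

    def handle_starttag(self, tag, attrs):
        wanted = {"link": "href", "img": "src"}.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.paths.append(value)


def classify_resources(html_text, base_dir):
    """Return (path, status) pairs; status is 'remote', 'found' or 'missing'.

    Hypothetical helper: relative paths are resolved against base_dir and
    percent-decoded first, which is what a browser does for local files.
    """
    collector = ResourceCollector()
    collector.feed(html_text)
    results = []
    for path in collector.paths:
        parsed = urlparse(path)
        if parsed.scheme in ("http", "https") or path.startswith("//"):
            results.append((path, "remote"))
        elif (Path(base_dir) / unquote(parsed.path)).exists():
            results.append((path, "found"))
        else:
            results.append((path, "missing"))
    return results
```

Calling `classify_resources(Path("index.html").read_text(errors="replace"), ".")` from the output folder gives a quick picture of which references are broken. Note that it ignores any <base> element in the page.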

It's rather difficult to diagnose this issue without seeing raw content and being able to compare that with original content so any other information you can provide would be helpful.

Thanks;
Richard Moss
Read "Before You Post" before posting. Do not send me private messages. Do not expect instant replies.

All responses are hand crafted. No AI involved. Possibly no I either.

projectteam

Hi,
The files are located in the local folder, but when I use Firebug it says that the image source cannot be located.

Also, the files are .html and not .php. Can I force Cyotek WebCopy to always store HTML files?


I am sending you the .cwp file so you can check it on your side.

https://www.wetransfer.com/downloads/58a1c8b348ef19f6802801cff4f59f0a20160219075101/83c22b025393ca9a037d33fb6230d1c420160219075101/bbb1af

Thank you in advance

projectteam


Richard Moss

Hello,

Apologies for not getting back to you earlier. I've looked into the problem and it looks like a bug in WebCopy when the URL contains spaces. WebCopy is creating folders with the encoded version of the URL segment, e.g. Harley%20Davidson rather than the decoded Harley Davidson, so your web browser can't find the images to display, even though they have been downloaded.

The page is also making use of a <base> element with an absolute URI which is forcing the mapped images and other resources to be loaded from the original website. While WebCopy supports the <base> element in order to determine how to combine URLs, it should probably remove it completely from the HTML. I need to consider this one and write a couple of scenarios to test.
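To illustrate why the <base> element has this effect, here is a small Python sketch of how a relative reference is resolved with and without it. The `resolve` helper and the local file location are assumptions made for the example; the base href value is also assumed, since the actual page markup isn't shown in this thread.

```python
from urllib.parse import urljoin


def resolve(reference, page_url, base_href=None):
    """Resolve a reference roughly as a browser does: a <base href> overrides the page URL."""
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, reference)


# Hypothetical location of the downloaded page.
local_page = "file:///C:/cyotek/index.html"

# Without <base>, the image is looked up next to the local page.
print(resolve("data/images/15533.jpg", local_page))

# With an absolute <base href> (assumed value), the same reference
# points back at the original website.
print(resolve("data/images/15533.jpg", local_page, "http://www.mywebsite.com/"))
```

The second call resolves to the http://www.mywebsite.com/data/images/15533.jpg address quoted in the first post, so rewritten relative paths never reach the local copy while the <base> element is present.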

Unfortunately there's not much you can do to work around these issues without manual work, for example deleting the <base> element from all downloaded HTML and renaming all the folders containing percentage signs... not much fun there. I'll have a fix for the folder encoding issue available in the next few days. I'm not so sure on the base issue yet; as I said, I need to think that one over first.
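The manual work described above could be scripted. The following is only a rough sketch in Python, not a WebCopy feature: the function names are invented for illustration, it assumes the downloaded pages end in .html, and it edits files in place, so it should only be run against a backup copy of the output folder. It works on raw bytes so that pages in non-UTF-8 encodings (such as Greek code pages) are not altered beyond the removed tag.

```python
import re
from pathlib import Path
from urllib.parse import unquote

# Matches a <base ...> tag in raw bytes, case-insensitively.
BASE_TAG = re.compile(rb"<base\b[^>]*>", re.IGNORECASE)


def strip_base_tags(html_bytes):
    """Return the HTML with every <base> element removed."""
    return BASE_TAG.sub(b"", html_bytes)


def decode_folder_names(root):
    """Rename folders such as 'Harley%20Davidson' to 'Harley Davidson'.

    Deepest folders are renamed first so parent paths stay valid.
    """
    folders = [p for p in Path(root).rglob("*") if p.is_dir() and "%" in p.name]
    renamed = []
    for folder in sorted(folders, key=lambda p: len(p.parts), reverse=True):
        decoded = unquote(folder.name)
        # Skip names that decode to something containing a path separator.
        if decoded == folder.name or "/" in decoded or "\\" in decoded:
            continue
        target = folder.with_name(decoded)
        if not target.exists():
            folder.rename(target)
            renamed.append((folder, target))
    return renamed


def clean_copy(root):
    """Strip <base> from all .html files under root, then decode folder names."""
    for page in Path(root).rglob("*.html"):
        original = page.read_bytes()
        cleaned = strip_base_tags(original)
        if cleaned != original:
            page.write_bytes(cleaned)
    return decode_folder_names(root)
```

A call such as `clean_copy("C:/cyotek/output")` (a hypothetical path) would process the whole copy. The HTML itself is left pointing at the encoded paths, since the browser decodes them before looking on disk.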

Regards;
Richard Moss

projectteam

Hello,
I tried another program and it works.

Anyway thanks for your time.