Recent Posts

Pages: [1] 2 3 ... 10
1
WebCopy / Re: Problem with images not being downloaded
« Last post by Dractyas on June 13, 2021, 07:24:20 PM »
Hello

Has any of you had this problem and managed to solve it?
If you did, I would be grateful if you could tell me how.

I have been looking through the software's user manual and can't find anything that solves the problem; maybe one of you knows how to solve it.

Can someone who has participated in the development of the software help me?

Thanks.
2
WebCopy / Problem with images not being downloaded
« Last post by Dractyas on June 08, 2021, 08:28:09 PM »
Hello

I do not understand the English language well, sorry for any errors.

I am a new user of the software.
A few weeks ago I installed Cyotek WebCopy on my PC and downloaded a blog from Blogspot. The download completed, but when I opened the downloaded copy on my hard drive I found that some images from the original blog are missing.
I want to ask whether it is possible to change a setting and then run the download again so that the missing images are downloaded.
I don't remember seeing error messages referring to the pages where these images are, but I can try to check that if I find a "log" file.
If this is possible, I would be grateful if you could explain how to do it.

I appreciate your help.
Thanks.
3
ImageBox / Using Image and Selection
« Last post by splaisance76 on June 07, 2021, 05:55:45 PM »
I am trying to set up an ImageBox where the user can select an image to display and then select a section of the image to resize. I am not sure what I am missing and hope you can help. I looked at the General Demo form and at both the Drag Test and Resizable Selection Demo. When I get the resize box to work, it doesn't display the selected background image. When I get the background image correct, the resize box doesn't show while the selection is being made.

Thanks!
4
WebCopy / YouTube GoogleMap URL's and Javascript
« Last post by DABookerSr on June 04, 2021, 03:01:00 PM »
I have read the notes on JavaScript-rendered URLs and that WebCopy can't render them. However, I have an example of each where v 1.60 outputs the correct URL and version 1.8.3.768 does not. In both examples a jQuery script is used to display the output in a lightbox, so JS is involved, yet it works fine in 1.60.

For example:

Google Maps:
Actual Link
<a href="https://www.google.com/maps/embed?pb=!1m14!1m8!1m3!1d3107.5511364326253!2d-104.8256144!3d38.8427395!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x8713453efd9467e9%3A0x84f150b38648fe4d!2s610%20N%20Cascade%20Ave%2C%20Colorado%20Springs%2C%20CO%2080903!5e0!3m2!1sen!2sus!4v1605376728866!5m2!1sen!2sus" class="fancybox-iframe"><i class="gmap"></i></a>

1.60 build 559
<a href="https://www.google.com/maps/embed?pb=!1m14!1m8!1m3!1d3107.5511364326253!2d-104.8256144!3d38.8427395!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x8713453efd9467e9%3A0x84f150b38648fe4d!2s610%20N%20Cascade%20Ave%2C%20Colorado%20Springs%2C%20CO%2080903!5e0!3m2!1sen!2sus!4v1605376728866!5m2!1sen!2sus" class="fancybox-iframe"><i class="gmap"></i></a>

1.8.3.768
<a href="../../maps/embed.html?pb=!1m14!1m8!1m3!1d3107.5511364326253!2d-104.8256144!3d38.8427395!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x8713453efd9467e9%3A0x84f150b38648fe4d!2s610%20N%20Cascade%20Ave%2C%20Colorado%20Springs%2C%20CO%2080903!5e0!3m2!1sen!2sus!4v1605376728866!5m2!1sen!2sus" class="fancybox-iframe"><i class="gmap"></i></a>

YouTube
Actual Link
<a class="fancybox-youtube" href="https://www.youtube.com/watch?v=GFQEDKAJFso"><i class="video"></i> Station Cove Falls</a>

1.6.0 build 559
<a href="https://www.youtube.com/watch?v=GFQEDKAJFso" class="fancybox-youtube"><i class="video"></i><br>Station Cove Falls</a></p>

1.8.3.768
<a href="../../watch-1.html?v=GFQEDKAJFso" class="fancybox-youtube"><i class="video"></i><br>Station Cove Falls</a>

Thanks,
dennis
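
To find every link affected this way in a copied site, the pages can be scanned for hrefs that are no longer absolute but still carry a query string, which is the pattern in both 1.8.3.768 examples above. A minimal Python sketch; the names `HrefCollector` and `localised_links` are made up for illustration, and the "relative plus query string" test is only a heuristic, not how WebCopy itself decides anything:

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def localised_links(html):
    """Return hrefs that have no scheme but still carry a query string."""
    collector = HrefCollector()
    collector.feed(html)
    return [
        href for href in collector.hrefs
        if not urlsplit(href).scheme and urlsplit(href).query
    ]
```

Feeding it the two YouTube anchors above would report only the `../../watch-1.html?v=...` one, since the 1.60 output still starts with `https://`.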
5
WebCopy / Re: Bypass "Continue" button
« Last post by Poddans on May 26, 2021, 03:26:54 PM »
Hi Richard,

Thank you for replying to my post!

Hello,

By preview window I assume you mean the embedded web browser used for capturing a form or external login? This is embedded Internet Explorer and so isn't linked to WebCopy. It also uses your global OS cookies so if you logged into it, then it has likely kept that cookie. It shouldn't have anything to do with the current WebCopy version.

That is correct and this explains that part for me. I might have logged in to the site at the same time I switched versions which would explain the difference I experienced.

If the button isn't displayed when you are logged into the service, then using one of WebCopy's supported authentication methods (posting a form, manual authentication or copying cookies) should work. It is also possible that it uses user agent sniffing, so perhaps it is worth changing the agent to match that of an existing browser.

This may very well be a user error, so don't put too much thought into it. I did try to use the "form post" to log in to the site, but it opened a browser window to the primary site (the one I had typed into the "Website" field in the main WebCopy window), while the one I had to log in to was listed under "Additional hosts".

I did think that the cookie would solve all my problems when I finally got it to work but no dice. Again, it may very well have been me just thinking I got it to work when in fact an error was made somewhere.

Edit: I forgot to mention, but sample URLs would be of great benefit. Unless they link to NSFW sites in which case please don't post these.

I ran into another one of your posts on this forum where NSFW links were mentioned and that's why I didn't provide an example even though I know that probably would've been helpful.

The image host we're talking about in this thread was 1 of 5 used by the site I was trying to copy. The other 4 worked just fine with my settings so I suspect this particular host has taken some measures to make it more difficult to crawl.

I don't know how common this problem is for the users of your software, but if it comes up often enough to make you want to take a closer look at it, there is a Greasemonkey script called "Handy Image" (Sleazyfork link). This script bypasses "Continue to image" for a whole lot of image hosts and sends you directly to www.host.com/name.jpg. Maybe there's something in that script that you would find useful.

I managed to solve my problem in the end by using WebCopy to gather all the links I needed to the image host and then using another program to download the images behind those links. So, all's well that ends well.

Again, thanks a lot for taking the time to reply. :)
6
WebCopy / Re: Image problem still
« Last post by Richard Moss on May 25, 2021, 06:46:56 AM »
Hello,

Have you tried the tutorial for copying only images?
https://docs.cyotek.com/cyowcopy/current/examplecopyimages.html

Regards;
Richard Moss
7
WebCopy / Re: Bypass "Continue" button
« Last post by Richard Moss on May 25, 2021, 06:44:03 AM »
Hello,

By preview window I assume you mean the embedded web browser used for capturing a form or external login? This is embedded Internet Explorer and so isn't linked to WebCopy. It also uses your global OS cookies so if you logged into it, then it has likely kept that cookie. It shouldn't have anything to do with the current WebCopy version.

Unfortunately forms won't really work for this scenario. They are meant for authentication, so WebCopy doesn't scan the posted content for links; it only reads cookies and checks for redirects. I've logged issue #409 to revisit this functionality.

I've also logged issue #410 to consider a feature upgrade to forms which would allow you to run these dynamically by matching a URL and an expression for the form within the page. This isn't something the software should do automatically as we don't really need a crawler to start clicking delete buttons ;)

If the button isn't displayed when you are logged into the service, then using one of WebCopy's supported authentication methods (posting a form, manual authentication or copying cookies) should work. It is also possible that it uses user agent sniffing, so perhaps it is worth changing the agent to match that of an existing browser.
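
To check for user agent sniffing outside WebCopy, the same page can be requested with a browser-like agent and with a default one, and the two responses compared. A minimal Python sketch of the browser-like request; the function name is made up, and the agent string is just an example of a desktop browser value, so substitute the one your own browser sends:

```python
from urllib.request import Request

# Example desktop browser string (an assumption, not a required value).
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) "
    "Gecko/20100101 Firefox/89.0"
)


def browser_like_request(url, user_agent=BROWSER_UA):
    """Build a GET request that identifies itself as a regular browser."""
    return Request(url, headers={"User-Agent": user_agent})
```

Passing the result to `urllib.request.urlopen` sends it. If the page only differs when the agent changes, sniffing is the likely cause.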

Edit: I forgot to mention, but sample URLs would be of great benefit. Unless they link to NSFW sites in which case please don't post these.

Regards;
Richard Moss
8
WebCopy / Re: Feature request: parent URL field in filtering rules
« Last post by Richard Moss on May 25, 2021, 06:28:21 AM »
Hello,

Thanks for the feedback, that is an interesting idea. I've logged issue #408 to consider if and how this can be added.

Regards;
Richard Moss
9
WebCopy / Re: %3A = : et %2F = / not possible
« Last post by Richard Moss on May 25, 2021, 06:24:51 AM »
Hello,

I checked that forum but it does not contain any links such as you describe. The _text_ does, but as I've already mentioned (and the linked post discusses) these aren't valid URLs.
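
As a small illustration of why such text isn't picked up as a link, here is a Python sketch using the standard library's URL parser. The helper name `describe` and the `example.com` address are made up for the example:

```python
from urllib.parse import unquote, urlsplit


def describe(text):
    """Return (scheme a URL parser sees, percent-decoded form of the text)."""
    return urlsplit(text).scheme, unquote(text)


# With %3A and %2F in place of ":" and "/", the parser finds no scheme,
# so the text reads as a plain relative path rather than an absolute URL.
scheme, decoded = describe("https%3A%2F%2Fexample.com%2Fpage")
```

Only after decoding does the text become something a crawler could treat as an absolute URL.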

Regards;
Richard Moss
10
WebCopy / Bypass "Continue" button
« Last post by Poddans on May 24, 2021, 04:33:43 PM »
I'm trying to copy a site that contains links to an external image host. The rules seem to be set up correctly, as all the relevant pages from both the source host and the image host are being scanned properly.

However, once the crawler arrives at the page that contains the image, it's stopped in its tracks by a "Continue to image" button. It saves everything else that's available on this page except the JPEG, which is only displayed after clicking the button.

This button has no link attached to it, which means it's not being picked up by WebCopy. I've opened the same link in my browser with JavaScript turned on and off, and the site works the same regardless, so it doesn't seem to be controlled by JS.

I've messed around a bit with "Form properties" and here's where it gets a bit interesting.

Using an older version of WebCopy, I input a link to the image host and the preview is identical to that of my web browser. I'm faced with the "Continue to image" button. This is what I found in the raw data:

Code:
<form action="" method="POST">
<input id="continuebutton" type="submit" name="imgContinue" value="Continue to your image..." />
</form>

After updating to the latest version and repeating the process above, the preview window takes me directly to the image, no button in sight. However, the scan and copy process still behaves the same as before.

I've tried creating an account at the image host and using the credentials in WebCopy to get around having to press the button to see the image.

I've added Post values using the code above but I'm not very good at it and my trial and error didn't seem to lead anywhere.
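
For reference, this is roughly what a browser would send when the button is clicked, written as a Python sketch outside WebCopy. The field name and value come from the form above; the function name is made up, and whether the host then returns the image (or also wants cookies or a referrer) is something I can't confirm:

```python
from urllib.parse import urlencode
from urllib.request import Request


def continue_request(page_url):
    """Build the POST that clicking "Continue to your image..." would send."""
    # Name and value are taken from the form's submit <input>.
    body = urlencode({"imgContinue": "Continue to your image..."}).encode("ascii")
    # action="" means the form posts back to the page's own URL.
    return Request(
        page_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Passing the result to `urllib.request.urlopen` would send it. The response would then need to be checked for the actual image link.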

I would appreciate any idea on what it is I'm missing. As I stated, everything else seems to be set up properly as the relevant page is scanned and downloaded. I tried the same settings on another image host that doesn't have that button and everything downloaded fine.