Copying Site that uses password

Started by washa, March 12, 2015, 12:24:47 PM


washa

Hi,

I wanted to download the courses that I have registered for on the website: http://www.fenzidogsportsacademy.com. The course content that I have enrolled in is at this link: http://www.fenzidogsportsacademy.com/index.php/my-courses. The site downloads using the first address, but when I look at what I have downloaded it asks me to log in and then goes straight to the web. I did put my login details in the password screen in WebCopy. Wonder if anyone has any tips.

thanks

Richard Moss

Hello,

Thanks for your message, and sorry you're having troubles with WebCopy. I took a quick look at the login page you reference and I suspect you have the same problem as other users - basically, there are extra hidden fields present (not just username, password and remember me) which are probably used for additional validation.

As these hidden values are generated for each unique session, there is currently no way for WebCopy to log in to such sites: the program's existing behaviour is to submit only the content you provide, and you can't provide a value you don't know about.
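To illustrate the problem described above, here is a minimal Python sketch of the general technique: fetch the login page, collect its hidden input values, and merge them with the credentials the user supplies before posting. This is not WebCopy's implementation; the names `HiddenFieldParser`, `merge_form_values`, `SAMPLE_LOGIN_PAGE` and the field name `session_token` are all hypothetical and used only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name and (attributes.get("type") or "text").lower() == "hidden":
            self.fields[name] = attributes.get("value") or ""


def merge_form_values(login_page_html, user_values):
    """Return hidden page values overlaid with the user-supplied values."""
    parser = HiddenFieldParser()
    parser.feed(login_page_html)
    merged = dict(parser.fields)   # values the user cannot know in advance
    merged.update(user_values)     # values the user typed in
    return merged


# Hypothetical login page; a real site would generate the token per session.
SAMPLE_LOGIN_PAGE = """
<form action="/login" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="session_token" value="abc123">
</form>
"""

if __name__ == "__main__":
    body = merge_form_values(
        SAMPLE_LOGIN_PAGE, {"username": "wendy", "password": "secret"}
    )
    # The encoded body is what would be posted to the form's action URL.
    print(urlencode(body))
```

Submitting only the username and password would omit `session_token`, which is why a login that relies on such a field fails when the hidden values are not merged in.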

The slightly good news is that I'm in the middle of fixing this bug now. The not so good news is that it isn't a quick fix, partly because I'm replacing some core parts of the crawling engine to make it easier to mock. This makes testing easier, as the tests no longer rely on a real world website full of edge cases. Once the tests are passing I then need to update the UI to handle advanced scenarios such as multiple forms on a single page.

I would guess it will be a minimum of one week, possibly longer, before the next update to WebCopy is released - one that should finally put this issue to rest.

Sorry for the inconvenience.

Regards;
Richard Moss

washa

thanks for that Richard. Will be looking forward to the update  :)

Richard Moss

Hello,

WebCopy 1.0.10.0 has just gone live which will hopefully resolve this one once and for all. Seems to be OK, but I can't test the whole internet so I'm sure there's something wrong waiting to be discovered. But if you can give it a whirl and see if it fixes your particular issue that would be helpful.

Edit: Make sure you enable the Merge Values option for your form - it won't be turned on by default for existing projects until I'm reasonably sure there are no side effects. There's a blog post which describes the new changes.

Regards;
Richard Moss

washa

thanks Richard.

The password seems to be working. I still can't seem to copy the website, but it doesn't seem to be the password issue now, so it must be something to do with the way they've built the site. I'll keep trying!

thanks
Wendy

Richard Moss

Wendy,

Thanks for the follow up - glad at least that part is working! In regards to the failure to copy the site, is there anything specific, i.e. are pages missing from the site map, or is it to do with the copy itself, i.e. overwritten or corrupt files? If navigation options in the site are provided via JavaScript, WebCopy is going to find it almost impossible to copy the site, but it should be detecting most other links fine, before getting hit by any regressions/bugs.
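As a rough illustration of why JavaScript-driven navigation is hard for a crawler that reads static HTML, the Python sketch below extracts `href` values from anchor tags only. It is an assumption about how static link discovery works in general, not a description of WebCopy's crawler; the `LinkExtractor` class and the sample page paths are hypothetical.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags found in static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical page: one plain link, two destinations reachable only via script.
SAMPLE_PAGE = """
<a href="/courses/intro.html">Intro</a>
<span onclick="window.location='/courses/advanced.html'">Advanced</span>
<script>
  var a = document.createElement('a');
  a.href = '/courses/hidden.html';
  document.body.appendChild(a);
</script>
"""


def find_links(html):
    """Return the links a static parser can see in the given HTML."""
    extractor = LinkExtractor()
    extractor.feed(html)
    return extractor.links


if __name__ == "__main__":
    print(find_links(SAMPLE_PAGE))
```

Only the plain anchor is found; the destinations set by the `onclick` handler and the script never appear as links unless the JavaScript is actually executed.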

I'd really like to see some posts on this forum that aren't bug reports for WebCopy so any clues you can give on copy failures that will help improve WebCopy would be great :)

Regards;
Richard Moss

washa

Changed a setting in the properties (crawl above root) and it all works perfectly now  :)

Thank you, this has saved me so much time compared with doing it manually  :)

Wendy

Richard Moss

Wendy,

Great, glad you've got that resolved! Generally though I wouldn't have thought you needed to use that setting, or are you only copying a portion of a website and were having problems with assets like CSS or images that were outside this? I suppose doing it that way might be easier than trying to use rules.

Thanks;
Richard Moss
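The scenario Richard describes, where copying only a portion of a site leaves out assets that live outside that path, can be sketched as a simple URL scope check. This Python example is a guess at the general idea behind a "crawl above root" style setting, not WebCopy's actual logic; the function `in_scope`, its `crawl_above_root` parameter, and the asset path `/templates/site.css` are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def in_scope(url, root, crawl_above_root=False):
    """Decide whether url should be copied when crawling from root."""
    target, start = urlsplit(url), urlsplit(root)

    # Never leave the host the crawl started on.
    if target.netloc.lower() != start.netloc.lower():
        return False

    # With the setting enabled, anything on the same host is allowed.
    if crawl_above_root:
        return True

    # Otherwise only accept URLs at or below the starting directory.
    if start.path.endswith("/"):
        root_dir = start.path
    else:
        root_dir = posixpath.dirname(start.path).rstrip("/") + "/"
    return (target.path or "/").startswith(root_dir)


if __name__ == "__main__":
    root = "http://www.fenzidogsportsacademy.com/index.php/my-courses"
    # Hypothetical stylesheet location outside the starting path.
    asset = "http://www.fenzidogsportsacademy.com/templates/site.css"
    print(in_scope(asset, root))
    print(in_scope(asset, root, crawl_above_root=True))
```

Under these assumptions the stylesheet is skipped by default and copied once crawling above the root is allowed, which matches the kind of missing-asset problem mentioned above.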