Author Topic: Any way to 'scan' this?  (Read 683 times)

KneeDeep

  • Newbie
  • *
  • Posts: 3
  • Karma: +0/-0
Any way to 'scan' this?
« on: March 08, 2018, 12:58:30 AM »
Hello Richard,
Any way WebCopy can scan this type of url?

https://www.diveboard.com/msantilli/D7BwPpz#map
- Next to Spot there is a url that is not clickable, any way to scan/collect that location? Below is the part of the HTML used. Any help would be great.

<input value="https://www.diveboard.com/explore/spots/saint-lucia/soufriere-L4TtYZQ/supermans-flight-S6YRNgV" disabled="" style="width: 600px;" type="text">

Richard Moss

  • Cyotek Team
  • Administrator
  • Sr. Member
  • *****
  • Posts: 278
  • Karma: +17/-0
    • cyotek.com
Re: Any way to 'scan' this?
« Reply #1 on: March 08, 2018, 05:29:20 PM »
Hello,

Yes, this should be possible, although at the same time it could lead to odd results if there are other populated input elements on the site that aren't URL related.

If you open the Project Properties dialog and select the Custom Attributes category, you'll find an area where you can enter custom XPath expressions to match attribute or element values.

I did a quick test and I believe that adding //input[@type='text'][@disabled][@value]/@value to the expression list will do what you need. If you aren't familiar with XPath, this looks for any input element whose type attribute has the value text and which also has value and disabled attributes, and then returns the contents of its value attribute.
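To see what that expression selects outside of WebCopy, here is a minimal Python sketch using only the standard library. It rests on a few assumptions: the markup is a hypothetical, well-formed stand-in for the page (the input is self-closed so an XML parser accepts it), the two extra input elements are invented for contrast, and because ElementTree only supports a subset of XPath with no attribute steps, the final /@value is emulated by reading the attribute in Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page markup (not the real page).
SAMPLE = """
<html><body>
  <input value="https://www.diveboard.com/explore/spots/saint-lucia/soufriere-L4TtYZQ/supermans-flight-S6YRNgV" disabled="" style="width: 600px;" type="text" />
  <input type="text" value="enabled-field" />
  <input type="checkbox" disabled="" value="not-a-text-field" />
</body></html>
"""


def disabled_text_input_values(markup):
    """Return the value attribute of every disabled text input."""
    root = ET.fromstring(markup.strip())
    # Same predicates as the forum expression; the leading "." is required
    # by ElementTree for a search relative to the root element.
    matches = root.findall(".//input[@type='text'][@disabled][@value]")
    # ElementTree cannot select attributes, so read "value" by hand.
    return [element.get("value") for element in matches]


for url in disabled_text_input_values(SAMPLE):
    print(url)
```

Only the first input is returned: the second has no disabled attribute and the third is not of type text.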

Hope this helps, if not please let me know.

Regards;
Richard Moss

Edit: Updated to include the final /@value otherwise it would return the element instead of the attribute.
« Last Edit: March 08, 2018, 06:18:51 PM by Richard Moss »

Richard Moss

Re: Any way to 'scan' this?
« Reply #2 on: March 15, 2018, 07:01:25 PM »
Hello,

Did you delete your original reply to the message?

I've just done some testing and have the following observations.

If I have source HTML of <img data-original="/assets/img/background5.png" src="/assets/img/background1.png" alt="Background" style="width:  100%;" />, then the expression //img/@data-original will pull out the background5.png. So far so good.

With the source HTML <input type="text" disabled="disabled" value="/assets/img/background6.png"> the expression //input[@type='text'][@disabled][@value] will pull out background6.png. It will also generate 404s for text and disabled, as WebCopy will try to match any attribute in the expression to avoid making it even more complicated for the user. I hadn't noticed earlier as I'd only tested with expressions containing a single attribute. Something to fix, but I was able to pull out the image URLs that normally wouldn't be matched.
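The side effect described above can be imitated with a short Python sketch. This is not WebCopy's implementation, only an illustration of the described behaviour under stated assumptions: BASE_URL is a hypothetical site root, candidate_urls is a made-up helper, the markup is self-closed so the standard library XML parser accepts it, and attribute names are pulled from the expression with a simple regular expression.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

BASE_URL = "http://example.com/"  # hypothetical site root
EXPRESSION = "//input[@type='text'][@disabled][@value]"
SAMPLE = (
    '<body><input type="text" disabled="disabled" '
    'value="/assets/img/background6.png" /></body>'
)


def candidate_urls(markup, expression, base_url):
    """Treat the value of every attribute named in the expression as a URL."""
    # Attribute names mentioned anywhere in the expression.
    names = re.findall(r"@([\w-]+)", expression)
    root = ET.fromstring(markup)
    urls = []
    # Prefix with "." because ElementTree only accepts relative paths.
    for element in root.findall("." + expression):
        for name in names:
            value = element.get(name)
            if value is not None:
                urls.append(urljoin(base_url, value))
    return urls


for url in candidate_urls(SAMPLE, EXPRESSION, BASE_URL):
    print(url)
```

Under these assumptions the helper yields three candidates: one built from text, one from disabled, and the real image path. The first two are the kind of requests that would come back as 404s.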

Regards;
Richard Moss