Recent posts

#41
WebCopy / hx-get attribute
Last post by rooot - February 02, 2024, 12:40:00 PM
Do you plan to support the hx-get attribute? It makes the browser issue a GET request to another URL and inserts the contents into the current page. This can also happen recursively, with those loaded contents in turn containing hx-get attributes that load contents from more URLs.

Explanation from here:

Quote: The hx-get attribute will cause an element to issue a GET to the specified URL and swap the HTML into the DOM using a swap strategy:

  <div hx-get="/example">Get Some HTML</div>

This example will cause the div to issue a GET to /example and swap the returned HTML into the innerHTML of the div.
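For context, hx-get is part of the htmx library. A crawler could support it by treating hx-get values like href targets: collect them, resolve them against the page URL, and queue them for download (recursively, since fetched fragments may contain hx-get attributes of their own). A minimal stdlib Python sketch, with illustrative names and URLs (this is not WebCopy's internals):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class HxGetCollector(HTMLParser):
    """Collect hx-get targets so a crawler can queue them like ordinary links."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "hx-get" and value:
                # hx-get URLs may be relative; resolve against the page URL
                self.urls.append(urljoin(self.base_url, value))

html = '<div hx-get="/example">Get Some HTML</div><span hx-get="frag.html">x</span>'
collector = HxGetCollector("https://example.com/page/")
collector.feed(html)
print(collector.urls)
# ['https://example.com/example', 'https://example.com/page/frag.html']
```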
#42
WebCopy / Re: some Suggestions
Last post by Paniz - January 26, 2024, 03:06:21 PM

For undownloaded URLs, 'relative' links are not changed to their 'absolute' equivalents

  I think most people expect and prefer every link
  to point to the correct destination when browsing offline

  i.e., a link should point to the file, if downloaded,
  and point to the URL, if not downloaded

  1) WebCopy changes links to point to the downloaded file

     ( e.g. < a href="[https://www.DownSite.com]/dir/faq.html" ...
            < img src="[https://www.DownSite.com]/images/banner.gif" ...

            will change to

            < a href="RootWC\dir\faq[-n].html" ...
            < img src="RootWC\images\banner[-n].gif" ...     n = 1, 2, ... )

  2) But for undownloaded URLs, WebCopy does nothing

     2-1) For absolute links, it's OK
     2-2) But for correct offline browsing, relative links
          must be converted to their absolute equivalents, in this case,

          ( e.g. < a href="[/]dir/faq.html" ...
                 < img src="[/]images/banner.gif" ...

                 should be changed to

                 < a href="https://www.DownSite.com/dir/faq.html" ...
                 < img src="https://www.DownSite.com/images/banner.gif" ... )
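The behaviour requested in 2-2 can be sketched in a few lines of Python. Here `downloaded` is a hypothetical set of the URLs the crawler actually saved; links that resolve to a downloaded URL are left alone (for the local-file remapper), while the rest are absolutized:

```python
import re
from urllib.parse import urljoin

# Hypothetical example: the set of URLs the crawler actually saved
downloaded = {"https://www.DownSite.com/dir/faq.html"}

def rewrite(match, base="https://www.DownSite.com/"):
    url = urljoin(base, match.group(2))
    if url in downloaded:
        return match.group(0)          # leave it for the local-file remapper
    return f'{match.group(1)}"{url}"'  # absolutize the undownloaded link

html = '<a href="/dir/faq.html">FAQ</a> <img src="/images/banner.gif">'
fixed = re.sub(r'((?:href|src)=)"([^"]+)"', rewrite, html)
print(fixed)
# the downloaded faq.html link is untouched; the banner link becomes absolute
```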


Unlike WebCopy, other web crawlers offer the user more options


  ※ 'Cyotek' WebCopy 1.9.1.872: Customize project properties

    ■ General

      ○ Local Files

        ◊ File Options

          [√] Remap references within downloaded files

    Remapping local files

      Although WebCopy will attempt to generate local filenames
      that match the remote URL, this isn't always possible.
      In addition, some websites use absolute URLs rather than relative ones.
      This means that HTML content downloaded by WebCopy may not link to local files correctly.
      WebCopy can solve this by changing the downloaded HTML
      to remap any detected links to point to their local versions.


  The Late 'Tennyson Maxwell' Teleport Pro/Ultra/VLX :  ‹ http://www.tenmax.com/

    ■ Offline 'Browsing/Mirroring' Tab

      [√] Replicate the directory structure of remote servers     ‹#1›

      ◊ Linkage System

        [√] Localize links for retrieved files     ‹#2›

            ◊ Links for unretrieved files

              ( ) Link to a message that explains why that file was not retrieved     ‹#3-1›

              (•) Link to the Internet address for the file                           ‹#3-2›

              ( ) Link to a place where the local file will be stored                 ‹#3-3›

              ┌─────────────────────────────────────┐
              Relink all files in the project now     ‹#4›
              └─────────────────────────────────────┘

    ‹#1›: If enabled, this allows you to make an exact duplicate (mirror) of a website
    ‹#2›: If you want to browse a site offline, you should enable this option

    ‹#3-1›: The message will also contain a link directly to the Internet
    ‹#3-2›: Rewrite the link so that it is "externalized" to point back out to the Internet
    ‹#3-3›: "predictively" link to a blank location where that file will be stored,
              when the file is retrieved later

    ‹#4›: This button can be used to immediately rewrite the links
            for all HTML files in the project folder,
            using the current linkage system specifications


  ※ 'MetaProducts' Offline Explorer ‹OE› Standard/Pro/Enterprise :  ‹ http://www.metaproducts.com/

    ■ Advanced: Link Conversion

      ◊ Select link translation mode: Change links in downloaded files

        ( ) Offline Conversion
            All links in downloaded files point to offline locations
            regardless of whether they were downloaded or not

        (•) On-line Conversion
            Only the links that are allowed for download will point to offline locations
            Restricted links (because of insufficient Level setting, URL Filters limitations, etc.)
            will point to their original online documents and files

        ( ) No Conversion
            Create a mirror copy of a Website with all original links
            (Warning: It might be impossible to browse a Project offline
            if it was downloaded using this mode)

        ( ) Mixed
            Keep all restricted links intact. All downloaded links will point offline


    This section allows you to define how the HTML links should be converted

      Unless the No Conversion mode is selected,
      Offline Explorer Enterprise will translate every link
      that should be loaded to enable offline browsing when the download is complete

    Offline Conversion
      Every link that is not loaded will be translated to the offline location
      This prevents dialup dialog boxes from popping up when browsing offline

    On-line Conversion
      Change every link that should not be loaded to the online (Web) location
      This may cause dialup dialog boxes to appear when browsing sites offline

    No Conversion
      Check only if you want to get exact copies of downloaded HTML pages
      Offline Explorer Enterprise will not translate any links in these files
      if this setting is checked
      You will be unable to browse downloaded Web sites offline

    Mixed
      Keep every link that should not be loaded intact
      All links that should be downloaded will point to offline locations


I wish WebCopy : [√] Remap references within downloaded files

  behaved like
  Teleport : Links for unretrieved files : (•) Link to the Internet address for the file
  or
  OE : Link Conversion : (•) On-line Conversion

  but unfortunately, it behaves like
  OE : Link Conversion : (•) Mixed
#43
ImageBox / Selection on image
Last post by Deer2014 - January 24, 2024, 09:05:42 PM
Hi Richard,

With a mouse drag I create a selection on an image. When I click inside the image loaded into ImageBox, but outside the selection, I want the selection to disappear. Sometimes it disappears only after repeated clicks, and sometimes it does not. How can this be fixed? Can you help me with this?

Thanks.

Regards,
Deer2014
#44
WebCopy / Re: 'URL Transforms' do not wo...
Last post by Paniz - January 23, 2024, 02:04:10 AM

By looking into the source, I found the following lines in a CSS file,

  ↓ https://www.bulkrenameutility.co.uk/assets/vendor/font-awesome/css/fontawesome-all.min.css

    src:url(../webfonts/fa-brands-400.eot);
    src:url(../webfonts/fa-brands-400.eot?#iefix) ...
    src:url(../webfonts/fa-regular-400.eot);
    src:url(../webfonts/fa-regular-400.eot?#iefix) ...
    src:url(../webfonts/fa-solid-900.eot);
    src:url(../webfonts/fa-solid-900.eot?#iefix) ...

So, I think the correct 'URL Transforms' is
' Expression: \.eot\?.* , Replacement: .eot'
not
' Expression: \.eot\?$  , Replacement: .eot'
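The difference between the two expressions can be checked with a plain regex substitution (Python shown here; WebCopy's actual regex engine is assumed to treat these patterns the same way):

```python
import re

urls = [
    "fa-brands-400.eot",
    "fa-brands-400.eot?",
    "fa-brands-400.eot?#iefix",
]

# '\.eot\?$' only strips a bare trailing '?' ...
strict = [re.sub(r"\.eot\?$", ".eot", u) for u in urls]
print(strict)  # ['fa-brands-400.eot', 'fa-brands-400.eot', 'fa-brands-400.eot?#iefix']

# ... while '\.eot\?.*' also strips the '?#iefix' variant
loose = [re.sub(r"\.eot\?.*", ".eot", u) for u in urls]
print(loose)   # ['fa-brands-400.eot', 'fa-brands-400.eot', 'fa-brands-400.eot']
```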

Anyway, 'URL Transforms' still doesn't work

Maybe 'URL Transforms' only works for  "absolute" URLs  in HyperTexts ?


Suppose a Website is set up to run over both  HTTP  and  HTTPS
  and also the  non-WWW links  do NOT redirect to the  WWW links  (or vice-versa)

  http[s]://[www.]DownSite.com/index.html

  If all the links in HyperTexts are relative, that won't be a problem

  (e.g. < a href="[/]dir/faq.html" ...
        < img src="[/]images/banner.gif" ... )

  but just one absolute link with  different protocol  or  different prefix (WWW or nothing)
  can cause a Website to be downloaded twice

  (e.g. < a href="http[s]://[www.]DownSite.com/dir/faq.html" ...
        < img src="http[s]://[www.]DownSite.com/images/banner.gif" ... )

  So, in theory, a Website may be downloaded up to 4 times per session

  e.g.
  URL1: ‹https://www.DownSite.com/dir/faq.html›   File1: ‹RootWC\dir\faq.html›
  URL2: ‹https://DownSite.com/dir/faq.html›       File2: ‹RootWC\dir\faq-1.html›
  URL3: ‹http://www.DownSite.com/dir/faq.html›    File3: ‹RootWC\dir\faq-2.html›
  URL4: ‹http://DownSite.com/dir/faq.html›        File4: ‹RootWC\dir\faq-3.html›

  Ticking the option ‹ URL Normalization: [√] Ensure internal links match domain prefix
  helps: URL2 is interpreted as URL1, and URL4 as URL3

  but to make WebCopy take URL1 & URL3 the same,
  I have no option but to ‹ Force HTTPS ›  ( i.e., URL3 as URL1, not vice-versa )

  ‹ Force HTTPS: (•) Always   or   (•) Only for the following hosts: www.DownSite.com

  What if a Website has a bad SSL Certificate and HTTP is preferable ?

  What about  Mixed Content ?
  ( i.e., the Website is loaded over  HTTPS,
         but some of its resources are loaded over  HTTP )

  In these cases, forcing  HTTPS  to avoid duplication
  can result in the loss of some URLs

  i.e., some URLs cannot be retrieved due to forcing the HTTPS protocol
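Collapsing the four scheme/prefix variants is essentially a URL-canonicalization step. A Python sketch of the idea (not how WebCopy works internally; the host is lowercased here as well, since hostnames are case-insensitive):

```python
from urllib.parse import urlsplit, urlunsplit

def canonicalize(url, scheme="https", add_www=True):
    """Collapse http/https and www/no-www variants of the same page."""
    parts = urlsplit(url)
    host = parts.netloc.lower()          # hostnames are case-insensitive
    if add_www and not host.startswith("www."):
        host = "www." + host
    return urlunsplit((scheme, host, parts.path, parts.query, parts.fragment))

variants = [
    "https://www.DownSite.com/dir/faq.html",
    "https://DownSite.com/dir/faq.html",
    "http://www.DownSite.com/dir/faq.html",
    "http://DownSite.com/dir/faq.html",
]
canon = {canonicalize(u) for u in variants}
print(canon)  # a single canonical URL instead of four downloads
```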


With MetaProducts Offline Explorer ‹OE› Enterprise: 'URL Substitutes',
  which is  case-insensitive,
  the user has a wide variety of choices, based on the website's specifications


  Apply to | URL | Replace     | With
  -------- | --- | -------     | ----

  [URLs]   |  *  | https://    | http://  Force HTTP  for  all hosts
  or
  [URLs]   |  *  | http://     | https:// Force HTTPS for  all hosts

  [URLs]   |  *  | ://www.     | ://      Remove WWW prefix from  all hosts
  or
  [URLs]   |  *  | ://         | ://www.
  [URLs]   |  *  | ://www.www. | ://www.  Add WWW prefix to  all hosts


  Apply to | URL                 | Replace      | With
  -------- | ---                 | -------      | ----

  Force HTTP , Remove WWW for  [www.]DownSite.com
  [URLs]   | ://DownSite.com     | https://     | http://
  [URLs]   | ://www.DownSite.com | http*://www. | http://

  Force HTTPS, Remove WWW for  [www.]DownSite.com
  [URLs]   | ://DownSite.com     | http://      | https://
  [URLs]   | ://www.DownSite.com | http*://www. | https://

  Force HTTP , Add WWW for  [www.]DownSite.com
  [URLs]   | ://DownSite.com     | http*://     | http://www.
  [URLs]   | ://www.DownSite.com | https://     | http://

  Force HTTPS, Add WWW for  [www.]DownSite.com
  [URLs]   | ://DownSite.com     | http*://     | https://www.
  [URLs]   | ://www.DownSite.com | http://      | https://


  By changing 'Apply to: 1) URLs' to 'Apply to: 2) Filenames',
  a matched URL is "virtually" converted to another URL

  If the converted URL has already been downloaded, nothing will be downloaded
  and the matched URL will be interpreted offline to point to the downloaded file

  If not, the matched URL will be downloaded as it is,
  but treated as if it were the converted URL
  i.e., the offline directory structure and filename for the downloaded file
  will be based on the converted URL, not the matched URL


  Or as is written in its manual,

  « Note: Apply to filenames means using the URL substitution rule only for downloaded files.
  I.e., URLs will be downloaded as they are,
  but offline filenames and links will be changed according to the substitution rule »



  So, the following lines mean:

  « Download it as it is, if the converted URL has not already been downloaded,
  but whether downloaded or not, consider it as ... »



  Apply to    | URL | Replace     | With
  --------    | --- | -------     | ----

  ... consider it as     http://etc
  [Filenames] |  *  | https://    | http://
  or
  ... consider it as     https://etc
  [Filenames] |  *  | http://     | https://

  ... consider it as     http[s]://etc
  [Filenames] |  *  | ://www.     | ://
  or
  ... consider it as     http[s]://www.etc
  [Filenames] |  *  | ://         | ://www.
  [Filenames] |  *  | ://www.www. | ://www.


  Apply to    | URL                 | Replace      | With
  --------    | ---                 | -------      | ----

  ... consider it as     http://DownSite.com/etc
  [Filenames] | ://DownSite.com     | https://     | http://
  [Filenames] | ://www.DownSite.com | http*://www. | http://

  ... consider it as     https://DownSite.com/etc
  [Filenames] | ://DownSite.com     | http://      | https://
  [Filenames] | ://www.DownSite.com | http*://www. | https://

  ... consider it as     http://www.DownSite.com/etc
  [Filenames] | ://DownSite.com     | http*://     | http://www.
  [Filenames] | ://www.DownSite.com | https://     | http://

  ... consider it as     https://www.DownSite.com/etc
  [Filenames] | ://DownSite.com     | http*://     | https://www.
  [Filenames] | ://www.DownSite.com | http://      | https://


By unticking the option ‹ URL Normalization: [ ] Ensure internal links match domain prefix
  ticking the option ‹ URL Normalization: [√] Ignore case (not recommended)
  and setting ‹ Force HTTPS: (•) Never
  I expected functionality similar to OE's 'URL Substitutes: Apply to: 1) URLs'
  from WebCopy's 'URL Transforms', but that is not what happens


  To be more precise, I guess 'URL Transforms' only works for
  absolute URLs  in HyperTexts and ignores  relative URLs



  Expression      | Replacement
  ----------      | -----------

  [√] https://    | http://    Force HTTP  for  all hosts  [?]
  or
  [√] http://     | https://    Force HTTPS for  all hosts  [?]

  [√] ://www\.    | ://        Remove WWW prefix from  all hosts  [?]
  or
  [√] ://(www\.)? | ://www.    Add WWW prefix to  all hosts  [?]


  Also, I found out that the 'URL Expression' field
  is a condition on the parent URL, not the URL itself
  So, the settings below are incorrect


  Expression      | Replacement | URL Expression
  ----------      | ----------- | --------------

  Force HTTP  for  [www.]DownSite.com  [×]
  [√] https://    | http://     | https?://(www\.)?DownSite\.com
  or
  Force HTTPS for  [www.]DownSite.com  [×]
  [√] http://     | https://    | https?://(www\.)?DownSite\.com

  Remove WWW prefix from  [www.]DownSite.com  [×]
  [√] ://www\.    | ://         | https?://(www\.)?DownSite\.com
  or
  Add WWW prefix to  [www.]DownSite.com  [×]
  [√] ://(www\.)? | ://www.     | https?://(www\.)?DownSite\.com


  Instead, I must use the settings below
  The 'URL Expression' field is useless in this case,
  but it adds extra functionality to WebCopy that OE lacks


  Expression                         | Replacement
  ----------                         | -----------

  Force HTTP , Remove WWW for  [www.]DownSite.com  [?]
  [√] https?://(www\.)?DownSite\.com | http://DownSite.com

  Force HTTPS, Remove WWW for  [www.]DownSite.com  [?]
  [√] https?://(www\.)?DownSite\.com | https://DownSite.com

  Force HTTP , Add WWW for  [www.]DownSite.com  [?]
  [√] https?://(www\.)?DownSite\.com | http://www.DownSite.com

  Force HTTPS, Add WWW for  [www.]DownSite.com  [?]
  [√] https?://(www\.)?DownSite\.com | https://www.DownSite.com
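Each row above is a plain regex substitution; the last one ('Force HTTPS, Add WWW'), for example, can be verified like this (Python sketch; WebCopy's regex engine is assumed to handle these patterns the same way):

```python
import re

# "Force HTTPS, Add WWW for [www.]DownSite.com" as a single substitution
pattern = r"https?://(www\.)?DownSite\.com"
replacement = "https://www.DownSite.com"

for url in [
    "http://DownSite.com/dir/faq.html",
    "https://DownSite.com/dir/faq.html",
    "http://www.DownSite.com/images/banner.gif",
]:
    # every variant collapses to the https://www. form
    print(re.sub(pattern, replacement, url))
```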

#45
WebCopy / HOW TO: correctly configure to...
Last post by ketanchandaria@gmail.com - January 17, 2024, 01:29:52 AM
Hi Richard,

I'm trying to DOWNLOAD from web.archive.org a particular SNAPSHOT in time of an old version of a website, along with all the downloadable links to materials like PDF, DOCX, XLSX, ZIP, etc., and the issue I'm facing is how to accurately set up the Project Properties:

1. It downloads multiple snapshots in time (not just the particular date, but other links within the date calendar of the Wayback Machine)
2. It does not download the pages even if I can see they have loaded (sometimes a page does not load instantly and gives an error page upon clicking any navigation, BUT after a few seconds the browser automatically loads the page)
3. It does not download any downloadable links to external documents
4. It creates multiple folders with different alphanumeric codes (I presume they are the different dates) within the directory web.archive.org>web
5. It does not download any HTML file, nor any content within that HTML file.

I'm seeking to download this entire website (all the main navigation links) so I can refer to everything (including the downloaded materials) offline on my PC: https://web.archive.org/web/20201016171853/https://www.lean.ohio.gov/Home.aspx

Please can you help me with this and provide me with the exact settings I need in the Project Properties or anywhere else.

Thanking you
KC
#46
WebCopy / some Suggestions
Last post by Paniz - January 14, 2024, 05:18:57 PM

───< Suggestion#1: >───

♦ Description:

  In future versions, 'URL Transforms' could be upgraded to
  something like MetaProducts OE Enterprise: 'URL Substitutes',
  as shown below:



  Apply to     | URL | Replace | With
  --------     | --- | ------- | ----
  ↓ List Box ↓ | ... |   ...   | ...
    1) URLs
    2) Filenames
    3) Disabled
    4) HTML Text


  'URL Substitutes: Apply to: 1) URLs' is almost equivalent to 'URL Transforms'

  'URL Substitutes: Apply to: 2) Filenames' is a feature that WebCopy lacks
  This mode has no effect on URLs, but gives the user the ability
  to change the default  'Relative Path'  and/or  'Filename'
  when saving a file to disk

  Unlike WebCopy, which enjoys a strong regular-expressions library,
  OE's regular-expression support is poor



───< Suggestion#2: >───

♦ Description:

  Replication of the directory structure, in non-Flatten mode, is a bit odd to me


Suppose a Site-Map, as listed below:

  https : // www . DownSite . com /index.html

  ↓ ‹index.html› has references
    (i.e. < img src="... , < link rel="stylesheet" href="... , < script src="...)
    to the following URLs

  https : // www . DownSite . com /Pic1.gif
  https : // www . DownSite . com /Pic2.jpg
  https : // www . DownSite . com /images/Pic3.png

  https : // SubD . DownSite . com /Pic2.jpg
  https : // SubD . DownSite . com /images/Pic3.png
  https : // SubD . DownSite . com /css/main.css

  https : // www . OutSite . net /Pic1.gif
  https : // www . OutSite . net /css/main.css


Teleport saves the site as below,
  in  [√] Replicate the directory structure of remote servers  mode,


  D:\Fol\DS.tpu   ←   Teleport Ultra Projects (*.tpu)

  D:\Fol\DS\www.DownSite.com\index.html

  D:\Fol\DS\www.DownSite.com\Pic1.gif
  D:\Fol\DS\www.DownSite.com\Pic2.jpg
  D:\Fol\DS\www.DownSite.com\images\Pic3.png

  D:\Fol\DS\SubD.DownSite.com\Pic2.jpg
  D:\Fol\DS\SubD.DownSite.com\images\Pic3.png
  D:\Fol\DS\SubD.DownSite.com\css\main.css

  D:\Fol\DS\www.OutSite.net\Pic1.gif
  D:\Fol\DS\www.OutSite.net\css\main.css


WebCopy saves the site as below,
  in non-Flatten mode,


  D:\Fol\DS.cwp   ←   WebCopy/Website Copy Projects (*.cwp)
  RootWC = C:\Downloaded Web Sites\ or ... [www.DownSite.com\]


  RootWC\index.html

  RootWC\Pic1.gif
  RootWC\Pic2.jpg
  RootWC\images\Pic3.png

  RootWC\Pic2-1.jpg
  RootWC\images\Pic3-1.png
  RootWC\css\main.css

  RootWC\Pic1-1.gif
  RootWC\css\main-1.css


I think what Teleport does is a "TRUE" replication
It simply shows me which file belongs to which domain/sub-domain

This is important, especially when we want to know at a glance
which downloaded executable file belongs to which domain
when browsing the directory structure

You could add this functionality as a new feature to WebCopy

Something like below,

  ◊ Folder Options

    [ ] Flatten website folder

  ↓ Can be replaced by:

  ◊ Folder Options

    Directory structure:
    ↓ List Box ↓
      1) Flatten
      2) Replicate (default)            ←  like WebCopy, in non-Flatten mode
      3) Replicate, including Domain    ←  like Teleport
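Option 3 could be sketched as: keep the host name as the top-level folder when mapping a URL to a local path. A Python illustration using the example URLs above (the function name is hypothetical, not a WebCopy API):

```python
from pathlib import PureWindowsPath
from urllib.parse import urlsplit

def local_path(url, root=r"D:\Fol\DS"):
    """Map a URL to a local file path, keeping the host as the top-level folder."""
    parts = urlsplit(url)
    return str(PureWindowsPath(root, parts.netloc, *parts.path.strip("/").split("/")))

print(local_path("https://www.DownSite.com/images/Pic3.png"))
# D:\Fol\DS\www.DownSite.com\images\Pic3.png
print(local_path("https://SubD.DownSite.com/css/main.css"))
# D:\Fol\DS\SubD.DownSite.com\css\main.css
```

Because the host is part of the path, the same filename on www, SubD, or OutSite never collides, so no "-1" suffixes are needed.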


───< Some trivial Graphical issues: >───

♦ There is no download progress indicator

  Unlike other web crawlers, WebCopy does not show the user
  how much of a file (in kilobytes or percent, in the 'Size' column)
  has been downloaded during a website copy


♦ The 'Apply' button is always enabled

  When I open a 'Setting' dialogue, I expect the 'Apply' button
  to be disabled until I change something in one or more controls
  (i.e. TextBox, ListBox, ComboBox, CheckBox, RadioButton),
  and to become disabled again when I click it,
  after saving changes

  ↓ 'Setting' dialogues are,

    'Tools' Menu: Options...
    'Project' Menu: Project Properties...
    'Project' Menu: Copy Settings...
    'Project' Menu: Additional Hosts...
    'Project' Menu: Content Types...
    'Project' Menu: Rules...
    'Project' Menu: Forms and Passwords...
    'Project' Menu: Proxy Server...
    'Project' Menu: User Agent...

  This can be done by setting the 'Change' event handler
  of every Control on the Form to enable the 'Apply' button



And in the end, sorry for so much text and so many ideas

Regards and Happy New Year


#47
WebCopy / 'URL Transforms' do not work
Last post by Paniz - January 14, 2024, 03:54:47 PM

Hi,

I'm new to WebCopy, but I have experience with other web crawlers

Surprisingly, I found WebCopy almost as light as Teleport and almost as powerful as OE
Two contrasting advantages in one package, plus some unique features


  The Late 'Tennyson Maxwell' Teleport Pro/Ultra/VLX: ‹http://www.tenmax.com/

  'MetaProducts' Offline Explorer ‹OE› Standard/Pro/Enterprise: ‹http://www.metaproducts.com/



───< 'Maybe Bug, maybe not' Report: >───

♦ Description:

  'URL Transforms' work fine during a Test,
  but are ignored during link processing when downloading files from the Internet



  I crawled a site's content by WebCopy, as per the following settings


  ↓ 'Cyotek WebCopy 1.9.1.872': Customize project properties

  ■ General

    ◊ Website

      URL: https://www.bulkrenameutility.co.uk/

    ◊ Crawl Mode

      (•) Sibling domains

    -----

    [√] Download all resources

    [ ] Limit crawl depth

    [ ] Limit distance from root URL

    ○ Folder

      ◊ Folder Options

        Save folder: C:\BR\

        [ ] Create folder for domain

        [ ] Flatten website folder

        [ ] Empty website folder before copy

            [ ] Use Recycle Bin

        Directory character: [Windows (\)]

    ○ Local Files

      ◊ File Options

        [√] Remap references within downloaded files

        [ ] Update local time stamps

        [ ] Use query string in local file names

      ◊ Remap file extensions by content type

        (•) Only for HTML

        [√] Keep original extension

    ○ Additional Hosts

      ...

    ○ Additional URLs

      ...

    ○ Content Types

      (•) Include all

    ○ Limits

      [ ] Maximum number of files: 0

      -----

      [ ] Minimum File Size: 0 KiB

      [ ] Maximum File Size: 0 KiB

    ○ Forms

      ...

    ○ Passwords

      ◊ Password Options

        [ ] Do not prompt for passwords

        [ ] Log in using web browser

      ◊ Saved Passwords

        ...

    ○ Rules

      Expression                                  |   Options
      ----------                                  |   -------
      [√] forum                                   |   Exclude
      [√] FileKicker/BRU_setup_3\.4\.4\.0\.exe$   |   Exclude

      ◊ Rule Properties

        Compare: [Path] [Matches]
        Expression: forum
        Options: [Exclude]

        ◊ Advanced
          [√] Enable this rule
          [ ] Stop processing more rules
          Download Priority [Normal]

        ↑ for ‹https://www.bulkrenameutility.co.uk/forum/

        -----

        Compare: [Path] [Matches]
        Expression: FileKicker/BRU_setup_3\.4\.4\.0\.exe$
        Options: [Exclude]

        ◊ Advanced
          [√] Enable this rule
          [ ] Stop processing more rules
          Download Priority [Normal]

        ↑ for ‹https://www.bulkrenameutility.co.uk/FileKicker/BRU_setup_3.4.4.0.exe

    ○ Speed Limits

      ◊ Limits

        (•) Limit to requests per second

            Maximum Requests per Second: 1

        -----

        [ ] Enforce Limit Checks

    ○ URL Normalization

      ◊ URL Normalization

        [ ] Honor canonical URI's {Disabled}

        [√] Ensure internal links match domain prefix

        [√] Ignore case (not recommended)

      ◊ Force HTTPS

        (•) Never

    ○ User Agent

        (•) Use custom user agent

            Mozilla/5.0 (Windows NT 6.1; rv:86.0) Gecko/20100101 Firefox/89.0

    ○ Web Browser

      [ ] Use web browser

    ○ Web Page Language

      ...

    ○ Sitemap

      [√] Create site map

      File name: index-wcopy.html

  ■ Advanced

    ◊ Crawl Behaviour

      [√] Use header checking (recommended)

      [√] Always download latest version

      -----

      [ ] Crawl above root URL (not recommended)

      [√] Keep alive (recommended)

      -----

      Origin report: [Create a single report for the entire project]

      [ ] Add to source HTML

    ◊ Abort HTTP Status Codes

      Status Codes:

    ○ Accepted Content Types

      ...

    ○ Cookies

      [ ] Discard session cookies

    ○ Custom Attributes

      ...

    ○ Custom Headers

      ...

    ○ HTTP Compression

      [√] Compress (Legacy)
      [√] Deflate
      [√] GZip
      [√] Brotli
      [√] BZip2

    ○ Link Map

      [√] Save link information in project

          [√] Include headers

      [ ] Clear link information before scan

    ○ Redirects

      (•) Follow internal redirects

      Maximum redirect chain length: 25

    ○ Security

      [√] Use SSL 3.0 (not recommended)
      [√] Use TLS 1.0 (not recommended)
      [√] Use TLS 1.1
      [√] Use TLS 1.2

      -----

      [ ] Ignore certificate errors (not recommended)

    ○ URL Transforms

      Expression         |   Replacement   |   URL Expression
      ----------         |   -----------   |   --------------
      [√] -400\.eot\?$   |   -400.eot      |
      [√] -900\.eot\?$   |   -900.eot      |

      ◊ Transform Properties

        Expression: -400\.eot\?$
        Replacement: -400.eot
        URL Expression:

        ↑ for:
          ‹https://www.bulkrenameutility.co.uk/assets/vendor/font-awesome/webfonts/fa-brands-400.eot?
          ‹https://www.bulkrenameutility.co.uk/assets/vendor/font-awesome/webfonts/fa-regular-400.eot?

        -----

        Expression: -900\.eot\?$
        Replacement: -900.eot
        URL Expression:

        ↑ for: ‹https://www.bulkrenameutility.co.uk/assets/vendor/font-awesome/webfonts/fa-solid-900.eot?

  ■ Deprecated

    ○ Default Documents

      ...

    ○ Domain Aliases

      ...

    ○ Proxy

      ...

    ○ Query Strings

      [ ] Strip query string segments


  There are 6 links in the project, listed below:

  URL  1-1: ‹https://www.bulkrenameutility.co.uk/assets/vendor/font-awesome/webfonts/fa-brands-400.eot
    File 1-1: ‹C:\BR\assets\vendor\font-awesome\webfonts\fa-brands-400.eot

    URL  1-2: ‹https://www.bulkrenameutility.co.uk/assets/vendor/font-awesome/webfonts/fa-brands-400.eot?
    File 1-2: ‹C:\BR\assets\vendor\font-awesome\webfonts\fa-brands-400-1.eot

  URL  2-1: ‹https://www.bulkrenameutility.co.uk/assets/vendor/font-awesome/webfonts/fa-regular-400.eot
    File 2-1: ‹C:\BR\assets\vendor\font-awesome\webfonts\fa-regular-400.eot

    URL  2-2: ‹https://www.bulkrenameutility.co.uk/assets/vendor/font-awesome/webfonts/fa-regular-400.eot?
    File 2-2: ‹C:\BR\assets\vendor\font-awesome\webfonts\fa-regular-400-1.eot

  URL  3-1: ‹https://www.bulkrenameutility.co.uk/assets/vendor/font-awesome/webfonts/fa-solid-900.eot
    File 3-1: ‹C:\BR\assets\vendor\font-awesome\webfonts\fa-solid-900.eot

    URL  3-2: ‹https://www.bulkrenameutility.co.uk/assets/vendor/font-awesome/webfonts/fa-solid-900.eot?
    File 3-2: ‹C:\BR\assets\vendor\font-awesome\webfonts\fa-solid-900-1.eot

  I expect that with 'URL Transforms: ■ Expression: \.eot\?$ , ■ Replacement: .eot',
  URLs 1-2, 2-2, 3-2  are considered as  URLs 1-1, 2-1, 3-1, respectively

  So logically,

  1) I expect  Files 1-1, 2-1, 3-1  (*.eot)
     not to be re-downloaded as  Files 1-2, 2-2, 3-2  (*-1.eot), respectively

  2) and that  URLs 1-2, 2-2, 3-2  are translated offline
     to point to  Files 1-1, 2-1, 3-1  (*.eot), respectively

  But it does NOT happen ‼ (i.e., 'URL Transforms' has no effect)
#48
WebCopy / SSL/TLS error: Could not estab...
Last post by Paniz - January 14, 2024, 03:43:00 PM
Hi,

I tried to copy the following related websites with 'Cyotek WebCopy 1.9.1.872':

https://www.karenware.com/ , https://www.winett.com/

but I got the following error message:

The underlying connection was closed:
Could not establish trust relationship for the SSL/TLS secure channel.

Changing the options in the 'Advanced: Security' section didn't help


  ■ Advanced
    ○ Security

      [√] Use SSL 3.0 (not recommended)
      [√] Use TLS 1.0 (not recommended)
      [√] Use TLS 1.1
      [√] Use TLS 1.2
      -----
      [√] Ignore certificate errors (not recommended)
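For what it's worth, "Ignore certificate errors" corresponds roughly to disabling hostname checking and chain verification at the TLS layer. A Python sketch of such a context, for diagnosing whether the failure really is a trust issue (this does not reflect WebCopy's internals, and such a context should only be used for diagnosis):

```python
import ssl

# An SSL context that mirrors "[ ] Ignore certificate errors" being ticked:
# no hostname check, no certificate-chain verification
ctx = ssl.create_default_context()
ctx.check_hostname = False       # must be disabled before verify_mode
ctx.verify_mode = ssl.CERT_NONE

print(ctx.verify_mode)           # VerifyMode.CERT_NONE
```

If a fetch succeeds with this context but fails with the default one, the server's certificate chain (e.g. a missing intermediate or expired certificate) is the likely culprit.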

#49
WebCopy / error with this website
Last post by fonarar - January 11, 2024, 10:14:39 AM
I tried downloading this online dictionary website.

https://abadis.ir/

But in the middle of downloading, WebCopy gave me an error, and the process stopped.

WebCopy downloaded about 625 MB from that website, and here are the project's .cwp and .csv files:

https://pixeldrain.com/u/EHkN92K4

https://pixeldrain.com/u/zmMq1xB7

Could someone help me figure out where I went wrong?
#50
WebCopy / Re: Run webcopy again but avoi...
Last post by szetakyu - January 03, 2024, 05:20:27 AM
Quote from: rumia on July 26, 2023, 08:00:35 PM
i am currently using version 1.9.0.822
is there any news regarding the "-1" problem?
Currently I have the same problem, although I have set the 3 settings from the link in the first post.
filename-1.typ and filename.typ files have the same SHA-1 hash

If not, is it possible to get a function such that, if the file exists locally (path\file), then the download is skipped?

Same question here.
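The requested skip-if-exists behaviour, plus cleanup of byte-identical "-1" copies, could look roughly like this (hypothetical helper names; the file names echo the earlier examples):

```python
import hashlib
import tempfile
from pathlib import Path

def sha1(path: Path) -> str:
    return hashlib.sha1(path.read_bytes()).hexdigest()

def should_download(local: Path) -> bool:
    """Skip the request entirely when the mapped local file already exists."""
    return not local.exists()

def drop_duplicate(original: Path, copy: Path) -> bool:
    """Delete 'filename-1.typ' when it is byte-identical to 'filename.typ'."""
    if original.exists() and copy.exists() and sha1(original) == sha1(copy):
        copy.unlink()
        return True
    return False

root = Path(tempfile.mkdtemp())
(root / "faq.html").write_bytes(b"<html>same</html>")
(root / "faq-1.html").write_bytes(b"<html>same</html>")

print(should_download(root / "faq.html"))                      # False: already on disk
print(drop_duplicate(root / "faq.html", root / "faq-1.html"))  # True: -1 copy removed
```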