
Some Suggestions

Started by Paniz, January 14, 2024, 05:18:57 PM


Paniz


───< Suggestion#1: >───

♦ Description:

  In a future version, 'URL Transforms' could be upgraded to
  something like 'URL Substitutes' in MetaProducts OE Enterprise,
  as shown below:



  Apply to   | URL | Replace | With
  --------   | --- | ------- | ----
↓ List Box ↓ | ... |   ...   | ...
  1) URLs
  2) Filenames
  3) Disabled
  4) HTML Text


  'URL Substitutes: Apply to: 1) URLs' is almost equivalent to 'URL Transforms'

  'URL Substitutes: Apply to: 2) Filenames' is a feature that WebCopy lacks
  This mode has no effect on URLs, but lets the user
  change the default 'Relative Path' and/or 'Filename'
  used when saving a file to disk

  On the other hand, WebCopy has a strong Regular Expressions library,
  whereas OE's Regular Expressions support is poor
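
  To make the idea more concrete, here is a minimal Python sketch of how
  such a rule table might behave. It is only an illustration: the class
  and function names (Substitute, apply_rules), the mode strings and the
  sample rules are my own assumptions, not WebCopy's or OE's actual
  implementation. I assume that the 'URL' column selects which URLs a
  rule applies to, and that 'Replace'/'With' are a regex and its
  replacement.

```python
import re
from dataclasses import dataclass


@dataclass
class Substitute:
    apply_to: str      # "urls", "filenames", "html" or "disabled" (assumed names)
    url_pattern: str   # regex selecting the URLs this rule applies to
    replace: str       # regex to search for in the target text
    with_: str         # replacement text


def apply_rules(rules, target, url, value):
    """Apply every rule of the given target type whose URL pattern matches."""
    for rule in rules:
        if rule.apply_to != target:
            continue  # rule belongs to another mode (or is disabled)
        if not re.search(rule.url_pattern, url):
            continue  # rule does not cover this URL
        value = re.sub(rule.replace, rule.with_, value)
    return value


# Hypothetical sample rules
rules = [
    Substitute("urls", r".*", r"^http://", "https://"),
    Substitute("filenames", r"/download\.php", r"download\.php$", "setup.exe"),
    Substitute("disabled", r".*", r"\.html$", ".htm"),
]

url = "http://www.DownSite.com/download.php"
new_url = apply_rules(rules, "urls", url, url)                      # URL is rewritten
local_name = apply_rules(rules, "filenames", url, "download.php")  # only the saved name changes
```

  With these sample rules, the 'urls' rule changes the address that is
  requested, while the 'filenames' rule only changes the name used on
  disk and leaves the URL alone.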



───< Suggestion#2: >───

♦ Description:

  The way the directory structure is replicated in non-Flatten mode seems a bit odd to me


Suppose a Site-Map, as listed below:

  https : // www . DownSite . com /index.html

  ↓ ‹index.html› has references
    (i.e. < img src="... , < link rel="stylesheet" href="... , < script src="...)
    to the following URLs

  https : // www . DownSite . com /Pic1.gif
  https : // www . DownSite . com /Pic2.jpg
  https : // www . DownSite . com /images/Pic3.png

  https : // SubD . DownSite . com /Pic2.jpg
  https : // SubD . DownSite . com /images/Pic3.png
  https : // SubD . DownSite . com /css/main.css

  https : // www . OutSite . net /Pic1.gif
  https : // www . OutSite . net /css/main.css


Teleport saves the site as below,
  in  [√] Replicate the directory structure of remote servers  mode,


  D:\Fol\DS.tpu   ←   Teleport Ultra Projects (*.tpu)

  D:\Fol\DS\www.DownSite.com\index.html

  D:\Fol\DS\www.DownSite.com\Pic1.gif
  D:\Fol\DS\www.DownSite.com\Pic2.jpg
  D:\Fol\DS\www.DownSite.com\images\Pic3.png

  D:\Fol\DS\SubD.DownSite.com\Pic2.jpg
  D:\Fol\DS\SubD.DownSite.com\images\Pic3.png
  D:\Fol\DS\SubD.DownSite.com\css\main.css

  D:\Fol\DS\www.OutSite.net\Pic1.gif
  D:\Fol\DS\www.OutSite.net\css\main.css


WebCopy saves the site as below,
  in non-Flatten mode,


  D:\Fol\DS.cwp   ←   WebCopy/Website Copy Projects (*.cwp)
  RootWC = C:\Downloaded Web Sites\ or ... [www.DownSite.com\]


  RootWC\index.html

  RootWC\Pic1.gif
  RootWC\Pic2.jpg
  RootWC\images\Pic3.png

  RootWC\Pic2-1.jpg
  RootWC\images\Pic3-1.png
  RootWC\css\main.css

  RootWC\Pic1-1.gif
  RootWC\css\main-1.css


I think what Teleport does is a "TRUE" replication
It shows me directly which file belongs to which domain or sub-domain

This is important, especially when we want to know at a glance
which downloaded executable file came from which domain
when browsing the directory structure

You could add this functionality as a new feature to WebCopy

Something like below,

  ◊ Folder Options

    [ ] Flatten website folder

  ↓ Can be replaced by:

  ◊ Folder Options

    Directory structure:
    ↓ List Box ↓
      1) Flatten
      1) Flatten
      2) Replicate  (default)           ←   Like WebCopy, in non-Flatten mode
      3) Replicate, including Domain    ←   Like Teleport
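
  The three proposed modes could be sketched roughly as below. This is
  only an illustration in Python, not WebCopy's code: the function name
  local_path and the mode strings are hypothetical, and the sketch
  ignores filename collisions (the '-1' suffixes shown above) and query
  strings.

```python
from pathlib import PureWindowsPath
from urllib.parse import urlsplit


def local_path(url, root, mode):
    """Map a URL to a local Windows path for one of the three proposed modes."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        segments = ["index.html"]  # assumed default document name
    if mode == "flatten":
        # Everything goes directly into the root folder
        return PureWindowsPath(root, segments[-1])
    if mode == "replicate":
        # Keep the remote folders, but not the host name
        return PureWindowsPath(root, *segments)
    if mode == "replicate_with_domain":
        # Keep the host name as the first folder, like Teleport
        return PureWindowsPath(root, parts.netloc, *segments)
    raise ValueError(f"unknown mode: {mode}")


url = "https://SubD.DownSite.com/images/Pic3.png"
for mode in ("flatten", "replicate", "replicate_with_domain"):
    print(mode, "->", local_path(url, r"D:\Fol\DS", mode))
```

  Because the host name becomes part of the path in the third mode,
  files with the same name on different hosts no longer collide, so
  fewer '-1' style renames should be needed.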


───< Some trivial Graphical issues: >───

♦ There is no download progress indicator

  Unlike other Web-crawlers, WebCopy does not show the user
  how much of a file has been downloaded so far
  (in kilobytes or percent, in the 'Size' column) while a website is being copied


♦ The 'Apply' button is always enabled

  When I open a 'Setting' dialogue, I expect the 'Apply' button
  to be disabled until I change something in one or more Controls
  (e.g. TextBox, ListBox, ComboBox, CheckBox, RadioButton)
  and to become disabled again once I click it
  and the changes have been saved

  ↓ 'Setting' dialogues are,

    'Tools' Menu: Options...
    'Project' Menu: Project Properties...
    'Project' Menu: Copy Settings...
    'Project' Menu: Additional Hosts...
    'Project' Menu: Content Types...
    'Project' Menu: Rules...
    'Project' Menu: Forms and Passwords...
    'Project' Menu: Proxy Server...
    'Project' Menu: User Agent...

  This can be done by attaching a 'Change' event handler
  to every Control on the Form, which enables the 'Apply' button
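
  A toolkit-independent sketch of that logic, written in Python just
  for illustration (the class SettingsDialog and its method names are
  hypothetical; in a real dialogue, on_change would be wired to the
  'Change' event of each Control):

```python
class SettingsDialog:
    """Tracks whether a settings dialogue has unsaved changes."""

    def __init__(self, values):
        self.saved = dict(values)      # values as last applied
        self.current = dict(values)    # values currently shown in the controls
        self.apply_enabled = False     # 'Apply' starts out disabled

    def on_change(self, name, value):
        """Shared change handler for every control on the form."""
        self.current[name] = value
        # Enable 'Apply' only while something differs from the saved state
        self.apply_enabled = self.current != self.saved

    def on_apply(self):
        """Save the current values and disable 'Apply' again."""
        self.saved = dict(self.current)
        self.apply_enabled = False


dialog = SettingsDialog({"flatten": False, "user_agent": "default"})
dialog.on_change("flatten", True)   # 'Apply' becomes enabled
dialog.on_apply()                   # 'Apply' is disabled again
```

  Comparing against the saved values, rather than just setting a flag,
  is a design choice of this sketch: it also disables 'Apply' again if
  the user manually reverts a change.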



Finally, sorry for so much text and so many ideas

Regards and Happy New Year



Paniz


For undownloaded URLs, 'relative' links are not converted to their 'absolute' equivalents

  I think most people expect and prefer every Link
  to point to the correct destination, when browsing offline

  i.e., the link should point to the local file, if it was downloaded,
  and to the original URL, if it was not

  1) WebCopy changes links to point to the downloaded file

     ( e.g. < a href="[https://www.DownSite.com]/dir/faq.html" ...
            < img src="[https://www.DownSite.com]/images/banner.gif" ...

            will change to

            < a href="RootWC\dir\faq[-n].html" ...
            < img src="RootWC\images\banner[-n].gif" ...     n = 1, 2, ... )

  2) But for undownloaded URLs, WebCopy does nothing

     2-1) For absolute links, it's OK
     2-2) But for correct offline-browsing, relative Links
          must be converted to their absolute equivalent, in this case,

          ( e.g. < a href="[/]dir/faq.html" ...
                 < img src="[/]images/banner.gif" ...

                 should be changed to

                 < a href="https://www.DownSite.com/dir/faq.html" ...
                 < img src="https://www.DownSite.com/images/banner.gif" ... )
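
  The behaviour I have in mind can be sketched in a few lines of Python.
  This is only an assumption about how it could work, not WebCopy's
  actual remapping code: rewrite_link and the 'downloaded' dictionary
  (absolute URL → local path) are hypothetical names.

```python
from urllib.parse import urljoin


def rewrite_link(href, page_url, downloaded):
    """Return the local path for downloaded targets, else the absolute URL."""
    absolute = urljoin(page_url, href)  # resolves relative links against the page
    if absolute in downloaded:
        return downloaded[absolute]     # case 1: point to the local copy
    return absolute                     # case 2: point back to the Internet


page = "https://www.DownSite.com/index.html"
downloaded = {"https://www.DownSite.com/dir/faq.html": "dir/faq.html"}

local = rewrite_link("/dir/faq.html", page, downloaded)
remote = rewrite_link("images/banner.gif", page, downloaded)
```

  Here 'local' ends up pointing to the downloaded copy, while 'remote'
  is turned into a full URL, so it still works when the page is opened
  from the local drive.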


Unlike WebCopy, other Web-crawlers offer more options to the user


  ※ 'Cyotek' WebCopy 1.9.1.872: Customize project properties

    ■ General

      ○ Local Files

        ◊ File Options

          [√] Remap references within downloaded files

    Remapping local files

      Although WebCopy will attempt to generate local filenames
      that match the remote URL, this isn't always possible
      In addition, some websites use absolute URLs rather than relative ones
      This means that HTML content downloaded by WebCopy may not link to local files correctly
      WebCopy can solve this by changing the downloaded HTML
      to remap any detected links to point to their local versions


  ※ The late 'Tennyson Maxwell' Teleport Pro/Ultra/VLX :  ‹ http://www.tenmax.com/

    ■ Offline 'Browsing/Mirroring' Tab

      [√] Replicate the directory structure of remote servers     ‹#1›

      ◊ Linkage System

        [√] Localize links for retrieved files     ‹#2›

            ◊ Links for unretrieved files

              ( ) Link to a message that explains why that file was not retrieved     ‹#3-1›

              (•) Link to the Internet address for the file                           ‹#3-2›

              ( ) Link to a place where the local file will be stored                 ‹#3-3›

              ┌─────────────────────────────────────┐
              Relink all files in the project now     ‹#4›
              └─────────────────────────────────────┘

    ‹#1›: It allows you to make a mirror/an exact duplicate of a website, if enabled
    ‹#2›: If you want to offline browse a site, you should enable this option

    ‹#3-1›: The message will also contain a link directly to the Internet
    ‹#3-2›: Rewrite the link so that it is "externalized" to point back out to the Internet
    ‹#3-3›: "predictively" link to a blank location where that file will be stored,
              when the file is retrieved later

    ‹#4›: This button can be used to immediately rewrite the links
            for all HTML files in the project folder,
            using the current linkage system specifications


  ※ 'MetaProducts' Offline Explorer ‹OE› Standard/Pro/Enterprise :  ‹ http://www.metaproducts.com/

    ■ Advanced: Link Conversion

      ◊ Select link translation mode: Change links in downloaded files

        ( ) Offline Conversion
            All links in downloaded files point to offline locations
            regardless of whether they were downloaded or not

        (•) On-line Conversion
            Only the links that are allowed for download will point to offline locations
            Restricted links (because of insufficient Level setting, URL Filters limitations, etc.)
            will point to their original online documents and files

        ( ) No Conversion
            Create a mirror copy of a Website with all original links
            (Warning: It might be impossible to browse a Project offline
            if it was downloaded using this mode)

        ( ) Mixed
            Keep all restricted links intact. All downloaded links will point offline


    This section allows you to define how the HTML links should be converted

      Unless the No Conversion mode is selected,
      Offline Explorer Enterprise will translate every link
      that should be loaded to enable offline browsing when the download is complete

    Offline Conversion
      Every link that is not loaded will be translated to the offline location
      This prevents dialup dialog boxes from popping up when browsing offline

    On-line Conversion
      Change every link that should be not loaded to the online (Web) location
      This may cause dialup dialog boxes to appear when browsing sites offline

    No Conversion
      Check only if you want to get exact copies of downloaded HTML pages
      Offline Explorer Enterprise will not translate any links in these files
      if this setting is checked
      You will be unable to browse downloaded Web sites offline

    Mixed
      Keep every link that should be not loaded intact
      All links that should be downloaded will point to offline locations


I wish WebCopy : [√] Remap references within downloaded files

  behaved like
  Teleport : Links for unretrieved files : (•) Link to the Internet address for the file
  or
  OE : Link Conversion : (•) On-line Conversion

  but unfortunately, it is like
  OE : Link Conversion : (•) Mixed
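
  The difference between the four OE modes, as I understand them from
  the descriptions above, can be sketched like this in Python. It is
  only my reading of the help text, not OE's implementation:
  convert_link, predicted_local, the mode strings and the 'allowed' set
  (URLs the project is allowed to download) are all hypothetical.

```python
from urllib.parse import urljoin, urlsplit

MODES = ("offline", "online", "none", "mixed")


def predicted_local(absolute_url):
    # Hypothetical mapping: the URL path without its leading slash
    return urlsplit(absolute_url).path.lstrip("/")


def convert_link(href, page_url, mode, allowed):
    """Rewrite one link according to the chosen link conversion mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "none":
        return href                       # exact mirror, nothing is changed
    absolute = urljoin(page_url, href)
    if mode == "offline" or absolute in allowed:
        return predicted_local(absolute)  # point to the offline location
    if mode == "online":
        return absolute                   # restricted link → original online URL
    return href                           # "mixed": restricted link kept intact


page = "https://www.DownSite.com/index.html"
allowed = {"https://www.DownSite.com/dir/faq.html"}

for mode in MODES:
    print(mode, "->", convert_link("/images/banner.gif", page, mode, allowed))
```

  For a restricted relative link, only the 'online' mode produces a full
  URL. The 'mixed' mode leaves the relative link as it is, which is
  the behaviour I described for WebCopy above.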