Thread: How to download a page including all links on the page

  1. #1
    Join Date
    Apr 2009
    Beans
    42

    How to download a page including all links on the page

    Hi all, I am wondering whether there is a tool that can download a page together with the pages it links to. For example, a manual might have an outline on its main page with links to each chapter and subchapter. I am looking for a tool that can download the whole manual, including the chapters and subchapters, with a single command. I hope such a tool exists.

    Thank you.

  2. #2
    Join Date
    Jun 2006
    Location
    $ pwd _
    Beans
    3,999
    Distro
    Ubuntu 12.10 Quantal Quetzal

    Re: How to download a page including all links on the page

    Code:
    wget -r http://www.website.com
    You can control how deep the recursion goes (the default is 5 levels) with the -l (lowercase L) option. If you only need the first 2 levels of http://www.website.com, run:

    Code:
    wget -r -l2 http://www.website.com
    For more information see
    Code:
    man wget
    EDIT: There is also a GUI wrapper for wget. Google gwget.
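
    For an offline copy of a manual, a few more wget options are often combined with -r. Here is a minimal sketch, assuming the manual lives under the hypothetical path /manual/ on the site above: --no-parent keeps wget from climbing above that directory, --page-requisites also fetches images and stylesheets, and --convert-links rewrites links so they work locally. The function only prints the command as a dry run; remove the echo to actually download.

```shell
# Hypothetical values for illustration; replace with your own.
URL="http://www.website.com/manual/"
DEPTH=2

# Prints the wget command instead of running it (dry run).
# Remove "echo" to start the real download.
mirror_cmd() {
    echo wget --recursive --level="$DEPTH" --no-parent \
        --page-requisites --convert-links "$URL"
}

mirror_cmd
```

    Keeping the depth small at first is a cheap way to check that the right pages come down before committing to a larger download.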
    Last edited by kpkeerthi; May 30th, 2009 at 09:47 AM.

  3. #3
    Join Date
    Apr 2009
    Beans
    42

    Re: How to download a page including all links on the page

    Hi Kpkeerthi, thank you for your prompt response. I tried the command, but because I am behind a proxy it gives me an error. How do I specify my username and password when I issue the command? BTW, is there a tool (like a download manager in Windows) that I can use in Ubuntu?

    407 Proxy Authentication Required
    2009-05-30 11:47:34 ERROR 407: Proxy Authentication Required.


    Thank you again for your help.

    Zelalem

  4. #4
    Join Date
    Jun 2006
    Location
    $ pwd _
    Beans
    3,999
    Distro
    Ubuntu 12.10 Quantal Quetzal

    Re: How to download a page including all links on the page

    Did you check the man page for proxy authentication?

  5. #5
    Join Date
    Apr 2009
    Beans
    42

    Re: How to download a page including all links on the page

    No, in fact the following is what I got:

    Resolving wwwproxy.xx.xx.xx... 145.xxx.xxx.xxx, 145.xxx.xxx.xxx, 145.xxx.xxx.xxx, ...
    Connecting to wwwproxy.xxx.xxx.xx|145.xxx.xxx.xxx|:3000... connected.
    Proxy request sent, awaiting response... 407 Proxy Authentication Required
    2009-05-30 11:47:34 ERROR 407: Proxy Authentication Required.

    I changed my proxy name and address above. So, I wanted to know how I can insert my username and password in the request line so that the proxy can authenticate me to get the page.

    Thanks again.

    BTW, how about a tool with a graphical user interface? Do you know of any tool that does this?

  6. #6
    Join Date
    Jun 2006
    Location
    $ pwd _
    Beans
    3,999
    Distro
    Ubuntu 12.10 Quantal Quetzal

    Re: How to download a page including all links on the page

    Code:
    --proxy-user=user
    --proxy-password=password
        Specify the username user and password password for authentication on a proxy server.
    Code:
    man wget
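
    Put together with the recursive download from earlier in the thread, the full command might look like the sketch below. The username and password are placeholders (assumptions, not real values), and the function only prints the command so you can check it first; remove the echo to run it.

```shell
# Placeholder credentials (assumptions); replace with your own.
PROXY_USER="myuser"
PROXY_PASS="mypassword"
URL="http://www.website.com"

# Prints the command as a dry run; remove "echo" to actually download.
proxy_cmd() {
    echo wget -r -l2 \
        --proxy-user="$PROXY_USER" --proxy-password="$PROXY_PASS" "$URL"
}

proxy_cmd
```

    Note that a password given on the command line can be seen by other users in the process list. wget can also read these values from ~/.wgetrc through the proxy_user and proxy_password settings, which avoids typing them each time.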

  7. #7
    Join Date
    Apr 2009
    Beans
    100

    Re: How to download a page including all links on the page

    You can download a Firefox add-on called FlashGot. It does everything you wanted.

  8. #8
    Join Date
    Apr 2009
    Beans
    42

    Re: How to download a page including all links on the page

    Hi Bobin, that is what I was looking for: a tool that I can use like a download manager in Windows. I will download it and try it. Thank you for your help.

    Thank you Kpkeerthi as well for your help.

    Cheers,

  9. #9
    Join Date
    Nov 2008
    Beans
    74

    Re: How to download a page including all links on the page

    Is this what you are looking for?

    http://www.httrack.com/

    The package is in the repositories and can be installed through Synaptic or Add/Remove.
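
    HTTrack can also be driven from the terminal once the httrack package is installed. This is a rough sketch, assuming a hypothetical manual URL and output directory: -O sets where the mirror is written and -r2 limits the mirror depth to 2. As written it only prints the command; remove the echo to run it.

```shell
# Hypothetical URL and output directory; replace with your own.
URL="http://www.website.com/manual/"
OUT_DIR="./manual-mirror"

# Prints the httrack command as a dry run; remove "echo" to run it.
# -O sets the output directory, -r2 limits the mirror depth to 2.
httrack_cmd() {
    echo httrack "$URL" -O "$OUT_DIR" -r2
}

httrack_cmd
```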
