
Thread: [SOLVED] xkcd downloader

  1. #1
    Join Date
    Mar 2008
    Beans
    68

    [SOLVED] xkcd downloader

So I made a sh script to download all the xkcd webcomics, but it saved each comic under its original file name only, so I added a counter in front of the file name.
I really want the format to be 1_name_morename.jpg, but to get the number and file name I use
Code:
mv $name $i.$name
If I use
Code:
mv $name $i_$name
the shell treats $i_ as the start of a single variable name, which is unset. Help, suggestions, comments, anything.
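A sketch of the usual fix: underscores are valid in variable names, so braces (or quotes) are needed to tell the shell where the name ends:

```shell
# _ is a valid variable-name character, so $i_$name is read as
# ${i_}${name}; braces or quotes mark where "i" ends
i=1
name=tree_cropped.jpg
echo "${i}_${name}"    # 1_tree_cropped.jpg
echo "$i"_"$name"      # same result with quotes
```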



    ORIGINAL CODE
    Code:
    #!/bin/sh
    
    cd /home/king3/Pictures/webcomics/XKCD/
    
    i=1;
    while [ $i -lt 450 ]
    do
    	wget http://xkcd.com/$i/
    	url=`grep http://imgs.xkcd.com/comics/ index.html | head -1 | cut -d\" -f2`	
    	name=`grep http://imgs.xkcd.com/comics/ index.html | head -1 | cut -d\" -f2 | cut -d\/ -f5`
            alttext=`grep http://imgs.xkcd.com/comics/ index.html | head -1 | cut -d\" -f6`
    	wget -N $url
    
    	mv $name "$i"_"$name"+"$alttext"
    	rm index.html
    	i=`expr $i + 1`
    done
Explanation of the grep | head | cut
Code:
# the grep returns the entire <img src=" line

# <img src="http://imgs.xkcd.com/comics/tree_cropped_(1).jpg" title="'Petit' being a reference to Le Petit Prince, which I only thought about halfway through the sketch" alt="Petit Trees (sketch)" /><br/> <h3>Image URL (for hotlinking/embedding): http://imgs.xkcd.com/comics/tree_cropped_(1).jpg</h3>

# adding "| head -1" keeps only the first matching line of HTML

# <img src="http://imgs.xkcd.com/comics/tree_cropped_(1).jpg" title="'Petit' being a reference to Le Petit Prince, which I only thought about halfway through the sketch" alt="Petit Trees (sketch)" /><br/>

# adding "cut -d\" -f2" takes the second field when the line is split on quote marks

# http://imgs.xkcd.com/comics/tree_cropped_(1).jpg

# the second cut filters down to the image name

# tree_cropped_(1).jpg

# to get the alt text, "cut -d\" -f6" will probably work
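The pipeline can be tried on a canned line without hitting the site (the title attribute is shortened here for readability):

```shell
# a sample of the <img src= line the grep returns
line='<img src="http://imgs.xkcd.com/comics/tree_cropped_(1).jpg" title="a title" alt="Petit Trees (sketch)" />'

echo "$line" | cut -d\" -f2                 # 2nd quote-field: the image URL
echo "$line" | cut -d\" -f2 | cut -d/ -f5   # 5th /-field: the file name
echo "$line" | cut -d\" -f6                 # 6th quote-field: the alt text
```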
    Last edited by king vash; July 23rd, 2008 at 07:07 PM.

  2. #2
    Join Date
    Jan 2008
    Location
    Auckland, New Zealand
    Beans
    3,131
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: xkcd downloader

    You can use:
    Code:
mv $name "$i"_"$name"
    And can you post the final script when you're done? That could come in handy.

  3. #3
    Join Date
    Aug 2006
    Beans
    198

    Re: xkcd downloader

    Do remember that the tagline in many xkcd comics is in the alt-text. How do you plan to package that with the images?

  4. #4
    Join Date
    Jul 2008
    Beans
    2

    Re: xkcd downloader

If you use Firefox 3, try OutWit Hub (it's an add-on).
It automatically detects the next pages and lets you download pictures in a batch (I just tried it on xkcd.com and it really works!), or if you want there is also a scraper widget.
https://addons.mozilla.org/en-US/firefox/addon/7271 or www.outwit.com
    Last edited by blackaj; July 23rd, 2008 at 03:26 PM.

  5. #5
    Join Date
    Mar 2008
    Beans
    68

    Re: xkcd downloader

Thank you ad_267, I will try that and see if it works.

Tuna-Fish - I would really like a tuna fish sandwich right now. I can use a different grep | head | cut for that; it would be pretty easy too, as the alt text is between quotes.


blackaj - I have used an add-on similar to the one you're describing, but my real purpose is to make about ten of these that download all my favorite webcomics once a week, and having to do any work other than typing something like "sh sh\UpdateComics" in a console sounds too hard.

Next question:


You might notice that the variable "name" is the same as "url" but with a second cut command. This next command will spit out the name of the file
    Code:
    echo $url | cut -d\/ -f5
but when I try to set the variable "name" equal to that it always fails
    Code:
    echo this doesn't work
    name=`$url | cut -d\/ -f5`
    echo $name
    echo neither does this
    name=$($url | cut -d\/ -f5)
    echo $name
What is the proper syntax to get the command substitution to work?
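For reference, both failing attempts above drop the echo, so the shell tries to run the contents of $url as a command instead of piping text into cut. A sketch of the working forms:

```shell
url="http://imgs.xkcd.com/comics/tree_cropped_(1).jpg"

# command substitution wraps a real command, so echo is needed
name=$(echo "$url" | cut -d/ -f5)
echo "$name"          # tree_cropped_(1).jpg

# the same without spawning a pipeline, using parameter expansion:
name=${url##*/}       # strip everything up to the last /
```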
    Last edited by king vash; July 23rd, 2008 at 07:08 PM.

  6. #6
    Join Date
    Mar 2008
    Beans
    68

    Re: xkcd downloader

    Almost final code
It is worth a cookie if someone can merge the two name= lines together.
    Code:
    #!/bin/sh
    
    cd /home/king3/Pictures/webcomics/XKCD/
    
    i=1;
    while [ $i -lt 453 ]
    do
    	wget http://xkcd.com/$i/
    	url=$(grep http://imgs.xkcd.com/comics/ index.html | head -1 | cut -d\" -f2)	
    	file=$(echo $url | cut -d\" -f2 | cut -d\/ -f5)
    	alttext=$(grep http://imgs.xkcd.com/comics/ index.html | head -1 | cut -d\" -f6)	
    	name=${file%_(1).jpg}
    	name=${name%.jpg}.jpg
    	wget $url
    	mv $file "$i"_"[""$alttext""]"_$name
    	rm index.html
    	i=`expr $i + 1`
    done
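For the cookie: taken together, the two name= lines just rewrite a trailing _(1).jpg into .jpg, so a single sed substitution does the same job (a sketch, only checked on the two obvious cases):

```shell
# replaces both lines:
#   name=${file%_(1).jpg}
#   name=${name%.jpg}.jpg
file='tree_cropped_(1).jpg'
name=$(echo "$file" | sed 's/_(1)\.jpg$/.jpg/')
echo "$name"   # tree_cropped.jpg
# plain names pass through untouched: foo.jpg stays foo.jpg
```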

  7. #7
    Join Date
    Jan 2006
    Location
    Philadelphia
    Beans
    4,076
    Distro
    Ubuntu 8.10 Intrepid Ibex

    Re: xkcd downloader

    a somewhat off-topic comment: thanks for reminding me to catch up on some xkcd

  8. #8
    Join Date
    Jul 2008
    Beans
    2

    Re: xkcd downloader

king vash, it was just that there is nothing to type in the console; just tell the program to save the files on your hard disk and browse to the end, and you'll see it's better than programming (anyway, that's my opinion).

  9. #9

    Re: xkcd downloader

    Quote Originally Posted by king vash View Post
my real purpose is to make about ten of these that download all my favorite webcomics once a week, and having to do any work other than typing something like "sh sh\UpdateComics" in a console sounds too hard.
I use dailystrips; it knows how to download 676 strips (count made just now in Hardy), xkcd being one of them.

The config file is especially easy. I call it once a day from cron and it makes a single HTML page with my daily strips (about ten of them).

It also keeps an archive of previously downloaded strips, sets up convenient links, ...

If you've been offline for an extended period, it has a --date option so you can download the strips for a specific date. You just have to set up a loop to download the last ten days' strips.

I've been using it for many years and am very happy with it.
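The catch-up loop could look something like this. To be clear about the assumptions: the --date option exists per the post above, but the YYYY.MM.DD format and the strip name as a positional argument are guesses here (check man dailystrips), and GNU date is assumed:

```shell
# download the last ten days' strips; the --date format and the
# "xkcd" argument are assumptions, not taken from the dailystrips docs
for n in 1 2 3 4 5 6 7 8 9 10; do
    d=$(date -d "-$n days" +%Y.%m.%d)   # GNU date
    dailystrips --date "$d" xkcd
done
```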

  10. #10
    Join Date
    Apr 2007
    Location
    Aarhus, Denmark
    Beans
    Hidden!
    Distro
    Ubuntu 9.10 Karmic Koala

    Re: xkcd downloader

    Slightly improved:
    Code:
    #!/bin/sh
    
    cd /home/lobner/Desktop/XKCD/
    
    i=1;
    while [ $i -lt 623 ]
    do
    	wget http://xkcd.com/$i/
    	url=$(grep http://imgs.xkcd.com/comics/ index.html | head -1 | cut -d\" -f2)	
    	file=$(echo $url | cut -d\" -f2 | cut -d\/ -f5)
    	ext=$(echo $file | cut -d\. -f2)
    	alttext=$(grep http://imgs.xkcd.com/comics/ index.html | head -1 | cut -d\" -f6)
    	titletext=$(grep http://imgs.xkcd.com/comics/ index.html | head -1 | cut -d\" -f4)
    	wget $url
    	mv $file "$i"_"[""$alttext""] - [""$titletext""]".$ext
    	rm index.html
    	i=`expr $i + 1`
    done
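One more tweak worth sketching: reading the page from wget's stdout avoids creating and then deleting index.html on every pass (the extraction is the same; this variant is untested against the live site):

```shell
#!/bin/sh
# sketch: grep the page straight from wget's stdout, so there is
# no index.html to rm at the end of each loop
i=1
while [ "$i" -lt 623 ]; do
    page=$(wget -q -O - "http://xkcd.com/$i/" | grep http://imgs.xkcd.com/comics/ | head -1)
    url=$(echo "$page" | cut -d\" -f2)
    wget -q -N "$url"
    i=$(expr $i + 1)
done
```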
    "Computer science is no more about computers than astronomy is about telescopes."
    - Edsger Dijkstra
