Thread: Capture data from web page ?

  1. #1
    Join Date
    Aug 2013

    Capture data from web page ?

    There is a web page I need to extract streaming data from.
    It is not video or music but numerical values.

    I can import the page with LibreOffice Calc, but that only captures static data, not live data.
    The labels load but not the numerical data I need.
    Example: where the page shows "Total...9999999", all I get is "Total:" with no number.
    I would like to capture this data every half hour.

    How can this be solved?


  2. #2
    Join Date
    Nov 2008
    Metro Boston
    Kubuntu Development Release

    Re: Capture data from web page ?

    I would write a script in bash or PHP and run it from cron every half hour.

    Unless the page contains only the data you need, you'll have to write some code to strip off any HTML cruft and just grab the data. Is the data in an HTML table? Then you'd have to parse the <tr><td> structures as well.
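
    A minimal sketch of that idea in Python rather than bash or PHP, using only the standard library. The URL, the "Total" label, and the two-cell row layout are assumptions for illustration, not details of the actual page. If the numbers are filled in by JavaScript after the page loads, the fetched HTML will not contain them and this approach will only see the labels.

```python
from html.parser import HTMLParser
import urllib.request

# Hypothetical address; replace with the real page.
PAGE_URL = "https://example.com/stats.html"


class TableCellParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_value(html, label):
    """Return the cell that follows `label` in its row, or None if absent."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    for row in parser.rows:
        if len(row) >= 2 and row[0].rstrip(":") == label:
            return row[1]
    return None


def fetch_page(url=PAGE_URL):
    """Download the raw HTML; scripts in the page are NOT executed."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Made-up sample mimicking the "Total...9999999" example from post #1.
SAMPLE = "<table><tr><td>Total:</td><td>9999999</td></tr></table>"

if __name__ == "__main__":
    print(extract_value(SAMPLE, "Total"))
```

    To run it every half hour, a crontab line along the lines of `*/30 * * * * python3 /path/to/scrape.py >> /path/to/scrape.log` would do, with both paths being placeholders.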

  3. #3
    Join Date
    Oct 2009
    Reykjavík, Ísland
    Lubuntu 17.10 Artful Aardvark

    Re: Capture data from web page ?

    Also worth trying wget.

  4. #4
    Join Date
    Feb 2007
    West Hills CA
    Ubuntu 14.04 Trusty Tahr

    Re: Capture data from web page ?

    Or curl. But if the data comes from a JavaScript calculator, you might have a difficult time scraping it: the script runs sandboxed inside the browser for security reasons, and tools like curl only download the page source without executing it. If you print-to-file, then you have a PDF that you can scrape. So now you need to figure out how to print the page to a file in /tmp every half hour and have your script scrape the latest PDF.

    If this is a public website, there may be APIs that you can use to get the same data. If it is bank account data (where you have to log in through a secure process), then you will have some difficulty.
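
    If the site does turn out to have an API, polling it is usually much simpler than scraping. A rough sketch, assuming the API returns JSON with a top-level "total" field; the URL and the field name are made up for illustration, so check the site's own API documentation for the real ones.

```python
import json
import urllib.request

# Hypothetical endpoint; the real one (if any) will differ.
API_URL = "https://example.com/api/stats"


def parse_total(payload_text, key="total"):
    """Pull one field out of a JSON object; returns None if it is missing."""
    data = json.loads(payload_text)
    return data.get(key)


def fetch_total(url=API_URL):
    """Request the JSON document and return the field of interest."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_total(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration with a made-up payload.
    print(parse_total('{"total": 9999999}'))
```

    The parsing is kept separate from the download so it can be checked against a saved response before the script is put under cron.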

