Results 1 to 4 of 4

Thread: Kill hanging programs using python

  1. #1
    Join Date
    Aug 2007
    Location
    Bangkok, TH
    Beans
    410

    Kill hanging programs using python

    Hi, I'm a python scripter (level: noob). As a practice, I've written a script to monitor the CPU usage of selected programs and restart them if suspected of hanging.

    I've also included a simple configuration tool to get started.
    Warning: I don't check for bad input from the user. For process, enter the name of the process (e.g. rhythmbox). For threshold, enter the CPU usage which the process may not exceed. (e.g. 50) It means, if rhythmbox uses more than 50% of CPU, it will be restarted.

    I'm looking for comments about my code. Maybe efficiency or portability. Anything helpful. I'm sure a sh script will be faster , but I'm not looking to learn that yet.

    file: killer.py
    Code:
    #!/usr/bin/env python
    """Kills and restarts selected processes if they use CPU above their limit"""
    ## Version 0.1
    
    ##################################################
    ## Author: Lian
    ##### Python n00b
    ##
    ## Date: Aug 12, 2009
    ## Email: lian{dot}hui{dot}lui{at}gmail{dot}com
    ##################################################
    
    import time # for sleep
    import os   # for popen
    import subprocess # for calling a subprocess
    
    ## read the config file
    listfile = "killerlist" # the list of processes are stored here 
    
    f = open(listfile,'r')
    print "Loading list.."
    file = f.read()
    
    processes = file.split('\n') # processes are separated by a newline 
    
    pname = []
    limit = []
    count = []
    
    sleeptime = 2 # in seconds # time to sleep between each query
    tolerate = 10 # number of times to tolerate CPU usage above limit before killing
    ## for example with sleeptime = 2, tolerate = 10, a hanging process will die after 20 seconds
    
    ## get the process name and its limit
    for i in processes:
        t = i.split('\t')
        if( len(t) == 2):
            print "found: ",i
            pname.append(t[0])
            limit.append(t[1])
            count.append(0)
    
    print "done" # done loading list
    
    while(True): # this script never ends.. unless an error occurs
        for i in range(0,len(pname)): # loop for each process
            pid = os.popen("pgrep "+pname[i]+" | tail -n 1").readline().strip()
            if pid == "":
                print pname[i],"is not running"
                break
    
            ## two different ways to get the CPU usage, which one is better?
            # cpu = os.popen('ps -o "%C" -p '+pid+' | tail -n 1').readline().strip()
            cpu = os.popen('top -b -n 1 -p '+pid).read().split('\n')[7][41:45].strip() # get the cpu column
    
            print pname[i], # print process name
            print pid, # print pid
            print "(cpu",cpu,"with limit",limit[i]+")",
            if(cpu == ""):
                print "couldn't get cpu, skip once"
                break
            over = (float(cpu)>float(limit[i]))
            print over, count[i]
            if(over):
                if(count[i] > tolerate):
                    subprocess.call("killall "+pname[i]+"; "+pname[i]+" > /dev/null &", shell=True)
                else:
                    count[i] = count[i]+1
            else:
                count[i] = 0 # reset count
    
        time.sleep(sleeptime)
    
    ## EOF
    file: killerconf.py
    Code:
    #!/usr/bin/env python
    """A crude configuration tool for killer.py"""
    ## Version 0.1
    
    ##################################################
    ## Author: Lian
    ##### Python n00b
    ##
    ## Date: Aug 12, 2009
    ## Email: lian{dot}hui{dot}lui{at}gmail{dot}com
    ##################################################
    
    listfile = "killerlist" # filename of list
    
    print "Welcome to killer config"
    
    ## Menu
    def menu():
        print "1) List all"
        print "2) Clear all"
        print "3) Add"
        print "0) Exit"
    
    choice = -1
    
    while(choice != 0):
        menu()
        choice = int(raw_input("Enter choice: "))
        if(choice == 1):
            f = open(listfile, 'r')
            print f.read()
        elif(choice == 2):
            f = open(listfile, 'w')
        elif(choice == 3):
            p = str(raw_input("Enter process: "))
            t = str(raw_input("Enter threshold: "))
            f = open(listfile, 'a')
            f.write(p+"\t"+t+"\n")
        elif(choice == 0):
            print "Exiting"
        else:
            print "Bad choice"

  2. #2
    Join Date
    Jan 2006
    Beans
    Hidden!
    Distro
    Ubuntu 10.10 Maverick Meerkat

    Re: Kill hanging programs using python

    how do you decide if a process is hung? I cannot tell from the code. also, ps way is better (IMO).

    CPU usage of 100% by a process is a completely useless metric.
    I am infallible, you should know that by now.
    "My favorite language is call STAR. It's extremely concise. It has exactly one verb '*', which does exactly what I want at the moment." --Larry Wall
    (02:15:31 PM) ***TimToady and snake oil go way back...
    42 lines of Perl - SHI - Home Site

  3. #3
    Join Date
    Apr 2007
    Beans
    Hidden!
    Distro
    Hardy Heron (Ubuntu Development)

    Re: Kill hanging programs using python

    The appropriate condition IMHO to declare a process hung is to use the stack trace as the condition in your code (see the pstack example in Linux at http://developers.sun.com/solaris/ar...linux_app.html).

    If the stack trace is identical for those n samples, and n x time between samples is reasonably large, that would be a strong indicator that the process is hung.
    Please mark threads as SOLVED ( if solved ) so other people can find solutions faster.

  4. #4
    Join Date
    Aug 2007
    Location
    Bangkok, TH
    Beans
    410

    Re: Kill hanging programs using python

    Quote Originally Posted by slavik View Post
    how do you decide if a process is hung? I cannot tell from the code. also, ps way is better (IMO).

    CPU usage of 100% by a process is a completely useless metric.
    In the file "killerlist", there is the process name and the limit. This limit decides whether a process is hanging. In my case, 50 is a good limit for rhythmbox. Since, normal usage takes only 10%, and hanging takes 100% (or above!).

    I used ps, but it doesn't show the actual usage at the current time. Here's an extract from man ps:
    Code:
           %cpu      %CPU   cpu utilization of the process in "##.#" format.
                            Currently, it is the CPU time used divided by the time the
                            process has been running (cputime/realtime ratio),
                            expressed as a percentage. It will not add up to 100%
                            unless you are lucky. (alias pcpu).
    PaulFr, thanks for the info. I've installed pstack. Now, I'm just waiting for a hanging process.

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •