Page 2 of 2
Results 11 to 12 of 12

Thread: [Python] High Memory Usage

  1. #11
    Join Date
    Dec 2004
    Location
    Manchester
    Beans
    2,086
    Distro
    Ubuntu Mate 15.10 Wily Werewolf

    Re: [Python] High Memory Usage

    Quote: Originally posted by AureiAnimus
    Also, a search for [python garbage collection] might give you more understanding about your problem.

    I recently had a problem with python memory usage, which came down more or less to this (semi-pseudo-code):

    Code:
    for file in directory:  # directory: an iterable of file paths
        text = open(file).read()
        # do something with text
    Now, if you're not familiar with lower-level languages and the Python garbage collector (I wasn't), you might think: well, the text variable is overwritten each time, so this can't possibly take much more space than my largest file. Except: it's not really overwritten. So if this loop continues, it will take up the memory of the sum of all the files in the directory.

    This can be avoided by placing the inner part of the for loop in a function and only passing that function the values it needs.
    Code:
    def read_and_do_something(file):
        text = open(file).read()
        # do something with text
    for file in directory:
        read_and_do_something(file)
    This way the text variable (and every other local, except file) is released when the function ends, because it goes "out of scope", which means there is no way it could be accessed any more.

    Hope this helps.
    Did that actually help?

    When you do
    Code:
    text = open(file).read()
    that creates a string and binds the name 'text' to it. When you rebind 'text' to a new string, the old string still exists for a moment but has no references pointing to it. In CPython, reference counting frees an object as soon as its last reference goes away; the separate garbage collector is only needed for reference cycles.

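    Moving it into a function means 'text' goes out of scope when the function returns, so no references point to the old string any more. In CPython it is freed right then, without waiting for the garbage collector, but the same thing happens in the plain loop as soon as 'text' is rebound.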
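
    A minimal sketch of the two cases above, assuming CPython (other implementations may free objects later). The `Text` class is a hypothetical stand-in for the file contents, used only because a plain str cannot be weakly referenced; the function names are illustrative too.

```python
import gc
import weakref


class Text:
    """Hypothetical stand-in for a large string read from a file."""

    def __init__(self, name):
        self.name = name


def rebinding_frees_old_object():
    """Return True if the old object is gone right after the name is rebound."""
    gc.disable()  # rule out the cyclic collector; only reference counting is left
    try:
        text = Text("1.txt")
        old = weakref.ref(text)   # lets us observe the object without keeping it alive
        text = Text("2.txt")      # rebinding drops the last strong reference
        return old() is None
    finally:
        gc.enable()


def function_scope_frees_local():
    """Return True if a local object is gone as soon as the function returns."""
    observed = []

    def work():
        text = Text("inside function")
        observed.append(weakref.ref(text))

    work()  # 'text' goes out of scope here
    return observed[0]() is None
```

    On CPython both functions are expected to return True, which suggests neither the loop nor the function version has to wait for a garbage collection pass.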

    I wonder if it's more of an allocator issue. If you do
    Code:
    text = open("1.txt").read()
    text = open("2.txt").read()
    then the memory to hold 2.txt is allocated before 'text' is rebound, so at some point both strings must be held in memory.

    So maybe this would work, and save you the trouble of having to restructure the code:
    Code:
    for file in directory:
        text = open(file).read()
        # do something with text
        del text  # drop the reference before the next file is read
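    Note that del only removes the reference; it does not free the memory directly. In CPython, though, if that was the last reference the string is freed right away, so it should be gone before the next file is read at the start of the loop. Whether the process then hands that memory back to the OS is up to the allocator.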
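
    To check whether the del actually lowers the peak, something like the following could be used. It is only a sketch, assuming CPython and the standard tracemalloc module; `make_blob` and `SIZE` are made-up stand-ins for `open(file).read()` and a file's size.

```python
import tracemalloc

SIZE = 10 * 1024 * 1024  # hypothetical size of one file's contents (10 MiB)


def make_blob():
    """Stand-in for open(file).read(): allocates SIZE bytes."""
    return bytes(SIZE)


def peak_when_rebinding():
    """Peak traced memory when 'text' is simply rebound."""
    tracemalloc.start()
    text = make_blob()
    text = make_blob()  # the new buffer exists before the old one is released
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def peak_with_del():
    """Peak traced memory when the old reference is deleted first."""
    tracemalloc.start()
    text = make_blob()
    del text            # last reference gone, so CPython frees the buffer here
    text = make_blob()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak
```

    Comparing the two return values should show roughly two buffers at peak for the rebinding version and roughly one for the del version. tracemalloc only sees Python-level allocations, so it says nothing about what the process returns to the OS.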

    Also, if you could rewrite the code as
    Code:
    for file in directory:
        for line in open(file):
            # do something with line
            pass
    then you only need to hold one line of text at a time.
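
    A runnable version of the line-by-line idea might look like this. It is a sketch only: `count_words` is a hypothetical placeholder for "do something with line", and the temporary files exist just to make the demo self-contained.

```python
import os
import tempfile


def count_words(path):
    """Process a file one line at a time instead of reading it whole."""
    total = 0
    with open(path) as handle:   # the file is closed when the block exits
        for line in handle:      # file objects yield lines lazily
            total += len(line.split())
    return total


def demo():
    """Create two small files in a temporary directory and count their words."""
    with tempfile.TemporaryDirectory() as directory:
        samples = {"1.txt": "a b c\nd e\n", "2.txt": "f\n"}
        for name, body in samples.items():
            with open(os.path.join(directory, name), "w") as out:
                out.write(body)
        return {
            name: count_words(os.path.join(directory, name))
            for name in sorted(os.listdir(directory))
        }
```

    The with block also closes each file promptly, which the bare `open(file)` calls above leave to the interpreter.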

  2. #12
    Join Date
    Sep 2009
    Location
    Canada, Montreal QC
    Beans
    1,809
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: [Python] High Memory Usage

    ssam, I have few to no cases like the one you described. I do all the file manipulation through my text control, and I rarely read a file myself.

    lavinog, I tried disabling all the plugins, but that made only a 1-2 MB difference.
    Maybe I should read more about Python memory management.
    Last edited by cgroza; August 15th, 2011 at 03:31 PM.
