shaggy999
November 26th, 2009, 05:19 AM
I have a simple bash script that I execute on every subdirectory of a given directory. The script churns for a while on each directory and maxes out one of my four cores. This is how I run it:
find . -type d -exec scriptname '{}' \;
This is really inefficient because the script is CPU-bound, not I/O-bound, so three of my four cores sit idle. It would be really awesome to run four copies of the script at once, one on each core. In theory that should cut the total elapsed time to about 1/4 of what it is now. Right now it takes about 8 hours to churn through all the data.
Does anybody know of any simple programs that would help with this, or have suggestions on how to write a bash script that is multi-core aware?
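For what it's worth, one existing tool that might fit is xargs with its -P option. Below is a minimal sketch, assuming GNU or BSD find and xargs (-print0, -0 and -P are extensions, not POSIX). The helper name run_on_dirs is made up for illustration, and I have not timed it against the real workload.

```shell
#!/usr/bin/env bash
# Sketch only: run a command on every subdirectory, several at a time.
# run_on_dirs is a hypothetical helper name, not an existing tool.
# Usage: run_on_dirs <max parallel jobs> <root directory> <command...>
run_on_dirs() {
    local jobs=$1 root=$2
    shift 2
    # -print0 / -0 keep directory names with spaces or newlines intact.
    # -n 1 passes one directory per invocation of the command.
    # -P caps how many invocations run concurrently.
    find "$root" -type d -print0 | xargs -0 -n 1 -P "$jobs" "$@"
}
```

With that defined, the original command would become something like: run_on_dirs 4 . scriptname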
I've got a general idea for a script that would go something like this:
DIRS=`find . -type d`
MAX_THREADS=4
THREADS=0
SLEEP_TIME=10
while ('$DIRS Contains data')
{
    if (THREADS < MAX_THREADS)
    {
        Spawn thread with line from DIRS
        THREADS++
    }
    else
    {
        sleep SLEEP_TIME
    }
}
Obviously, this is all pseudocode. The biggest hurdle I'm trying to get past is figuring out how to spawn new processes and how to get notified when one finishes, so I can decrement the thread count. Perhaps this is too much to ask of bash and I should move to Python or something.
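A rough sketch of how the pseudocode above might map onto bash job control: a trailing & spawns a background process, and wait -n blocks until one of them exits, which replaces the sleep-and-poll loop. This assumes bash 4.3 or newer (wait -n does not exist in older versions), and run_limited is a hypothetical name chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: at most max_jobs copies of a command run at once.
# Assumes bash 4.3+ for `wait -n`; run_limited is a hypothetical name.
# Usage: run_limited <max parallel jobs> <root directory> <command...>
run_limited() {
    local max_jobs=$1 root=$2
    shift 2
    local dir running=0
    # Read NUL-delimited names so unusual directory names survive.
    while IFS= read -r -d '' dir; do
        if [ "$running" -ge "$max_jobs" ]; then
            # Block until any one background job exits.
            # Its exit status is ignored in this sketch.
            wait -n || true
            running=$((running - 1))
        fi
        "$@" "$dir" &              # spawn the command on this directory
        running=$((running + 1))
    done < <(find "$root" -type d -print0)
    wait                           # let the remaining jobs finish
}
```

Something like run_limited 4 . scriptname would then stand in for the find command. The counter only tracks how many jobs were started and reaped, so if several jobs finish at nearly the same moment the loop may briefly run fewer than the maximum, but every directory is still processed.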