January 7th, 2010, 08:59 AM
I need to write a script in Python (2.5) to read text files with fixed-width columns. Since the number of columns and their widths are not necessarily known in advance, the script has to infer the column widths itself. The way I want to do it is to read through each line looking for where the word breaks are, and use the most common breaks to calculate the column widths. The problem I'm having is that I can't figure out how to write a regular expression to find the positions of the word breaks. Since some of the columns might be right-aligned, I want to find the position of both the start and the end of each word. I'm having a bit of trouble with the regex itself, but I also can't seem to get it to find each occurrence; it only finds the first. If I use re.findall, it only returns a list of the matched strings, not their positions. Can anyone give me some advice on this?
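For illustration, here is a rough sketch of the approach described above. It assumes that a "word" is any run of non-whitespace characters (the pattern `\S+`), and it uses `re.finditer`, which yields match objects whose `start()` and `end()` methods give the positions that `re.findall` leaves out. The function names `word_spans` and `tally_breaks` and the sample rows are made up for the example, and plain dicts are used for counting so it should also run on Python 2.5. It only tallies the break positions; turning the tallies into column widths is left open.

```python
import re

def word_spans(line):
    """Return a list of (start, end) offsets, one per run of non-whitespace."""
    return [(m.start(), m.end()) for m in re.finditer(r'\S+', line)]

def tally_breaks(lines):
    """Count how often each offset is a word start or a word end.

    Returns two dicts mapping offset -> number of lines where it occurs.
    Left-aligned columns should show up as frequent starts,
    right-aligned columns as frequent ends.
    """
    starts, ends = {}, {}
    for line in lines:
        for start, end in word_spans(line.rstrip('\n')):
            starts[start] = starts.get(start, 0) + 1
            ends[end] = ends.get(end, 0) + 1
    return starts, ends

# Hypothetical sample: one left-aligned column, two right-aligned columns.
row_format = "%-8s%3s%7s"
sample = [
    row_format % ("name", "qty", "price"),
    row_format % ("apple", "3", "1.50"),
    row_format % ("kiwi", "12", "10.25"),
]

starts, ends = tally_breaks(sample)

# Offsets sorted by how many lines share them, most common first.
common_starts = sorted(starts.items(), key=lambda item: -item[1])
common_ends = sorted(ends.items(), key=lambda item: -item[1])
```

In this sample the left-aligned first column shows up as a start offset shared by every line, while the right-aligned columns show up as shared end offsets, which is why both ends of each word are tallied.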