
Thread: Python text file input Arrays and Dicts

  1. #1
    Join Date
    Dec 2009
    Location
    UK
    Beans
    128
    Distro
    Ubuntu 11.04 Natty Narwhal

    Python text file input Arrays and Dicts

    I have been doing some basic tasks (calculations, file reading and writing) using Python. I am now a bit stuck and getting confused.
    I have a series of files in a format like:
    Code:
     
    Fruit ={
    
    Name = "apple"; Price = 1; Weight = 3; Colour = "Red"; DrinkList ={
    Cider = 1; Cordial = 1; Slush = 0; };
    }; Fruit ={
    Name = "pear"; Price = 1.5; Weight = 6; Colour = "Yellow"; DrinkList ={
    Cider = 1; Cordial = 0; Slush = 0; };
    };
    I need to be able to read in the file and produce reports on the data, such as a list of fruit that can make cider (Cider = 1).
    Each time I read the file it may have a different number of fruits and different extended data lists within them (i.e. some may have a drinklist, some may have a piplist, some may have no extended data list at all).
    I am lost trying to work out the basic structure of my Python code to separate out the fruits etc.
    I'm just trying to get some tips and tricks really, before I end up with pages of novice code!

    Many thanks

  2. #2
    Join Date
    Apr 2009
    Location
    Germany
    Beans
    2,134
    Distro
    Ubuntu Development Release

    Re: Python text file input Arrays and Dicts

    Do you have to use this file format?
    Is it some kind of standard format? I don't recognize it.

    If you can change the format, I recommend using a standard format like JSON, YAML, XML or RDF, or, which makes a lot of sense for Python, plain Python dictionaries and lists.

    These formats all have modules that parse them into native Python data structures, which greatly simplifies the code.

    e.g. json:
    Code:
    In [1]: import json
    
    In [2]: json.loads("""[{"Name" : "apple", "Price": 1, "DrinkList" : [ "Slush", "Cider" ]}]""")
    Out[2]: [{u'DrinkList': [u'Slush', u'Cider'], u'Name': u'apple', u'Price': 1}]
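
    Once the data is in native structures, a report becomes a one-line filter. A rough sketch, assuming a JSON layout like the one above (the two records and the cider_fruits name are made up for illustration):

```python
import json

# Hypothetical JSON version of the fruit data; this layout is an assumption,
# not something the external system is known to produce.
text = """[
  {"Name": "apple", "Price": 1, "DrinkList": ["Cider", "Cordial"]},
  {"Name": "pear", "Price": 1.5, "DrinkList": ["Cider"]},
  {"Name": "lemon", "Price": 0.5}
]"""

fruits = json.loads(text)

# Report: names of fruit whose DrinkList mentions "Cider".
# .get() with a default copes with records that have no DrinkList at all.
cider_fruits = [f["Name"] for f in fruits if "Cider" in f.get("DrinkList", [])]
print(cider_fruits)
```

    The .get() default is there because some records may carry no extended list.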
    Last edited by MadCow108; August 1st, 2011 at 04:06 PM.

  3. #3
    Join Date
    Mar 2009
    Location
    Western Hemisphere.
    Beans
    136
    Distro
    Ubuntu

    Re: Python text file input Arrays and Dicts

    For what it's worth, that looks to me like the OpenStep .plist format. Possibly you're working on something iPhone related? I'm with MadCow108 on using JSON if you can. If you can't, but you can get whatever program is generating the files to output them in the XML .plist format, then you can use plistlib to parse them easily.
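
    To give an idea of the plistlib route, here is a small sketch. The XML below is only my guess at how one fruit record might look as an XML plist, and it assumes a Python version that has plistlib.loads (3.4 or later):

```python
import plistlib

# Hypothetical XML plist for a single fruit record; the real export, if the
# generating program offers one, may be laid out differently.
xml_data = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<array>
  <dict>
    <key>Name</key><string>apple</string>
    <key>Price</key><integer>1</integer>
    <key>DrinkList</key>
    <dict>
      <key>Cider</key><integer>1</integer>
      <key>Slush</key><integer>0</integer>
    </dict>
  </dict>
</array>
</plist>"""

# plistlib.loads takes bytes and returns ordinary lists, dicts, strings and numbers.
fruits = plistlib.loads(xml_data)
print(fruits[0]["Name"], fruits[0]["DrinkList"]["Cider"])
```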

    Writing a reliable parser yourself is a non-trivial undertaking, so I'd avoid it unless you're doing this for the fun of writing a parser!

    If you are writing a parser, and super high speed performance isn't critical, some sort of parser generator will make your life much easier. Pyparsing is simple and easy to use, though it doesn't have the sophisticated features of some of the bigger parser generators.

  4. #4
    Join Date
    Dec 2009
    Location
    UK
    Beans
    128
    Distro
    Ubuntu 11.04 Natty Narwhal

    Re: Python text file input Arrays and Dicts

    Thank you for your replies, but no I do not have any choice over the file format.
    It is created by an external system which will not export in any other format.

    By Parser do you mean just a collection of Python code that will read in all the values for me?
    Each 'fruit' in my example is enclosed in {} so I was going to start there?

    Bit stumped

  5. #5
    Join Date
    Mar 2009
    Location
    Western Hemisphere.
    Beans
    136
    Distro
    Ubuntu

    Re: Python text file input Arrays and Dicts

    Originally posted by beegary:
    By Parser do you mean just a collection of Python code that will read in all the values for me?
    In essence, yes =). There are thousands of computer languages out there, and generally, taking text in one of them and turning it into data in a natively useful format is parsing, and a program that does that is called a parser.

    Making parsers is such a common task that people have made libraries that generate parsers so that you don't have to attend to all the details manually.

    I would definitely use Pyparsing for this, but mostly because I'm familiar with it. If the files are in a consistent format, parsing it using basic string operations should be pretty easy, so take a look at pyparsing if you're curious, but don't let me confuse you!
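
    If you do go the hand-rolled route, this is roughly the shape it could take using only the standard library. Treat it as a sketch: the grammar is guessed from the two records in the first post, the names (TOKEN, convert, parse_entries) are my own, and it does very little error checking on malformed or truncated input.

```python
import re

# One token is a quoted string, a single punctuation mark, or a bare word/number.
TOKEN = re.compile(r'"[^"]*"|[{};=]|[^\s{};="]+')


def convert(token):
    """Turn a scalar token into a str, int or float."""
    if token.startswith('"'):
        return token[1:-1]
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token


def parse_entries(tokens, pos=0):
    """Parse `key = value;` entries up to a closing brace or the end of input.

    Returns (list of (key, value) pairs, position of the next unread token).
    """
    entries = []
    while pos < len(tokens) and tokens[pos] != '}':
        key = tokens[pos]
        if tokens[pos + 1:pos + 2] != ['=']:
            raise ValueError("expected '=' after %r" % key)
        pos += 2
        if tokens[pos] == '{':
            inner, pos = parse_entries(tokens, pos + 1)
            value = dict(inner)  # nested blocks become plain dicts
            pos += 1             # skip the closing '}'
        else:
            value = convert(tokens[pos])
            pos += 1
        if pos < len(tokens) and tokens[pos] == ';':
            pos += 1
        entries.append((key, value))
    return entries, pos


sample = '''
Fruit ={
Name = "apple"; Price = 1; Weight = 3; Colour = "Red"; DrinkList ={
Cider = 1; Cordial = 1; Slush = 0; };
}; Fruit ={
Name = "pear"; Price = 1.5; Weight = 6; Colour = "Yellow"; DrinkList ={
Cider = 1; Cordial = 0; Slush = 0; };
};
'''

entries, _ = parse_entries(TOKEN.findall(sample))
# "Fruit" repeats at the top level, so keep the pairs as a list rather than a dict.
fruits = [value for key, value in entries if key == "Fruit"]
cider = [f["Name"] for f in fruits if f.get("DrinkList", {}).get("Cider") == 1]
print(cider)
```

    Nested blocks are turned into dicts, so a key repeated inside one block would overwrite the earlier value. That is fine for the sample, but worth checking against the real files.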

  6. #6
    Join Date
    Jun 2007
    Location
    The Fiery Desert
    Beans
    139
    Distro
    Ubuntu 7.04 Feisty Fawn

    Re: Python text file input Arrays and Dicts

    I've had to parse some oddly formatted data files. The structures weren't as complicated, but much longer. So here are the things I've learned.

    Regular expressions are not hard to learn, and are super-awesome. The Python module re does everything you need. Here is a tutorial on regex.
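
    As a taste of what re can do here, a sketch that pulls the simple key/value pairs out of one line. The pattern is my own guess based on the sample in the first post, so it only handles quoted strings and plain numbers:

```python
import re

line = 'Name = "apple"; Price = 1; Weight = 3; Colour = "Red"; DrinkList ={'

# A key, optional spaces, '=', then either a quoted string or a plain number, then ';'.
# 'DrinkList ={' is skipped because a '{' follows its '=' instead of a value.
pairs = re.findall(r'(\w+)\s*=\s*("[^"]*"|[\d.]+)\s*;', line)

# Drop the surrounding quotes; every value is still a string at this point.
fields = {key: value.strip('"') for key, value in pairs}
print(fields)
```

    The numbers come back as strings, so they still need int() or float() before any calculations.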

    The in operator is your best friend. It's a quick membership test that checks whether one string occurs inside another, and it rocks when matching text:
    Code:
    >>> 'blah' in 'blahblahblah'
    True
    Work through the data by breaking it down into progressively smaller blocks. For example, you know that a line consisting of just '};' (once stripped of whitespace and the trailing newline) marks the end of a section. So you can start with the list generated by open('filename').readlines(), and then break off each individual item.
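
    A sketch of that breaking-down step. One caveat: the sample in the first post puts '}; Fruit ={' on a single line, so instead of looking for lines that are just '};' this version splits the whole text on the 'Fruit ={' marker. The sample string stands in for open('filename').read().

```python
# Stand-in for the file contents, copied from the first post.
sample = '''Fruit ={

Name = "apple"; Price = 1; Weight = 3; Colour = "Red"; DrinkList ={
Cider = 1; Cordial = 1; Slush = 0; };
}; Fruit ={
Name = "pear"; Price = 1.5; Weight = 6; Colour = "Yellow"; DrinkList ={
Cider = 1; Cordial = 0; Slush = 0; };
};
'''

# The text before the first marker is empty, so drop it; each remaining chunk is one fruit.
blocks = [chunk.strip() for chunk in sample.split('Fruit ={')[1:]]

# Plain substring tests with `in` are enough for a first report.
cordial_blocks = [block for block in blocks if 'Cordial = 1' in block]
print(len(blocks), len(cordial_blocks))
```

    The substring test depends on the exact spacing around '=' and would also match a value such as 10, so it is only a starting point.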

    .strip(), .split(), and .join() are similarly very useful, the first two in particular when parsing text. For example:
    Code:
    >>> a = '     Colour = "Red";'
    >>> a.strip().split('=')[1].strip()
    '"Red";'
    >>> '"Red";'[1:-2]
    'Red'
    Of course these can be conveniently wrapped into a single line.
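
    For instance, something along these lines (field_value is just a name I picked for the sketch):

```python
def field_value(line):
    """Return the text after the first '=' without spaces, the ';' or quotes."""
    return line.split('=', 1)[1].strip().rstrip(';').strip().strip('"')


print(field_value('     Colour = "Red";'))
print(field_value('Price = 1.5;'))
```

    Using rstrip(';') and strip('"') instead of the [1:-2] slice means the same helper also works for unquoted values like the price.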

    The try statement may also be useful. It attempts to run some code, and if that fails (for example, because a particular field is absent from a given item), the matching except clause handles the error instead of letting it crash the program.
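
    A minimal sketch of that, assuming the fruit has already been read into a dict (the record below is made up and deliberately has no DrinkList):

```python
# Hypothetical parsed record with no extended data list.
fruit = {"Name": "pear", "Price": 1.5}

try:
    makes_cider = fruit["DrinkList"]["Cider"] == 1
except KeyError:
    # The field is absent for this fruit, so treat it as "cannot make cider".
    makes_cider = False

print(fruit["Name"], makes_cider)
```

    Catching only KeyError, rather than using a bare except, keeps genuine bugs visible.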

    I hope some of this random advice is helpful !

  7. #7
    Join Date
    Dec 2009
    Location
    UK
    Beans
    128
    Distro
    Ubuntu 11.04 Natty Narwhal

    Re: Python text file input Arrays and Dicts

    Thank you for the random tips.
    OK, I'm going to give it a go!

    I will post back here if I succeed. I mean when!

  8. #8
    Join Date
    Apr 2009
    Location
    Germany
    Beans
    2,134
    Distro
    Ubuntu Development Release

    Re: Python text file input Arrays and Dicts

    Doesn't the program which created these data files provide a library for parsing them?
    If it does, and it's written in C rather than Python, you could use it from Python via ctypes to save yourself some trouble.
    If not, the program really sucks and your best shot is probably pyparsing (or a similar parsing library).
    Last edited by MadCow108; August 3rd, 2011 at 12:58 PM.

  9. #9
    Join Date
    Apr 2009
    Location
    Germany
    Beans
    2,134
    Distro
    Ubuntu Development Release

    Re: Python text file input Arrays and Dicts

    By coincidence I stumbled over a file with a format suspiciously similar to yours, and it turns out it may very well be a pretty standard format: libconfig
    http://www.hyperrealm.com/libconfig/test.cfg.txt

    Maybe you can find some Python bindings for it, or at least use the native library with ctypes.

  10. #10
    Join Date
    Dec 2009
    Location
    UK
    Beans
    128
    Distro
    Ubuntu 11.04 Natty Narwhal

    Re: Python text file input Arrays and Dicts

    Thank you MadCow, that looks very similar. I'll let you know what I find.
