
Thread: C hash of array, hash of hash, etc.

  1. #1
    Join Date
    Jan 2009
    Location
    Pennsylvania
    Beans
    113
    Distro
    Ubuntu 14.10 Utopic Unicorn

    C hash of array, hash of hash, etc.

    Hello,

    In Perl I frequently store data in hashes of arrays, e.g. $data{beta2}[1701]. This allows me to do a lot with the data in very few lines of code.

    However, I am very curious about how to do this in C. For example, if I have

    Code:
    my %data;
    my @torsions = qw(alpha2 alpha3 alpha4 beta2 beta3 beta4 gamma1 gamma2 gamma3 gamma4 epsilon1 epsilon2 epsilon3 zeta1 zeta2 zeta3 chi1 chi2 chi3 chi4);
    foreach my $tor (@torsions) {
       open(FH,"<$tor") or die "cannot read $tor: $!";
       while (<FH>) {
          if (/\d+\.\d+\s+(\d+)\.(\d+)/) {
             push(@{ data{$tor}, "$1.$2");
          }
       }
       close FH;
    }
    The current version I have of this in C is about 200 lines long, tedious to write and to work with, and very difficult to read. How could I write this code more compactly in C?

    Thanks so much!

  2. #2
    Join Date
    Jul 2007
    Location
    Poland
    Beans
    4,499
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: C hash of array, hash of hash, etc.

    Maybe you could describe what it does in plain English; not many people can read Perl incantations.
    if your question is answered, mark the thread as [SOLVED]. Thx.
    To post code or command output, use [code] tags.
    Check your bash script here // BashFAQ // BashPitfalls

  3. #3
    Join Date
    Jan 2009
    Location
    Pennsylvania
    Beans
    113
    Distro
    Ubuntu 14.10 Utopic Unicorn

    Re: C hash of array, hash of hash, etc.

    Hi Vaphell,

    The idea is to store all of the data from these files in one large hash, e.g. instead of manually creating bulky multidimensional arrays and reading data to each of 20 arrays separately:
    Code:
           double (*alpha)[3], (*beta)[3], (*gamma)[4], (*chi)[4], (*epsilon)[3], (*zeta)[3];
           alpha = malloc(length*sizeof(double)*3);
           beta = malloc(length*sizeof(double)*3);
           gamma = malloc(length*sizeof(double)*4);
           epsilon = malloc(length*sizeof(double)*3);
           chi = malloc(length*sizeof(double)*4);
           zeta = malloc(length*sizeof(double)*3);
           for (unsigned int nuc = 2; nuc <= 4; nuc++) {//read in torsions with 3 present
    //read in alpha
              sprintf(file,"%salpha%u",simulation,nuc);
              FILE *fh;
              fh = fopen(file,"r");
              if (fh != NULL) {
                 unsigned int c = 0;
                 while(!feof(fh)) {
                    if (fscanf(fh,"%f %lf",&time,&alpha[nuc-2][c]) != 2) {
                       continue;
                    }
                    if (alpha[nuc-2][c] < 0.0) {
                       alpha[nuc-2][c] += 360.0;
                    }
                 }
              } else {
                 printf("Could not read %s.\n",file);
                 exit(EXIT_FAILURE);
              }
    //read in beta
              sprintf(file,"%sbeta%u",simulation,nuc);
              fh = fopen(file,"r");
              if (fh != NULL) {
                 unsigned int c = 0;
                 while(!feof(fh)) {
                    if (fscanf(fh,"%f %lf",&time,&beta[nuc-2][c]) != 2) {
                       continue;
                    }
                    if (beta[nuc-2][c] < 0.0) {
                       beta[nuc-2][c] += 360.0;
                    }
                 }
              } else {
                 printf("Could not read %s.\n",file);
                 exit(EXIT_FAILURE);
              }
    //read in zeta
              sprintf(file,"%szeta%u",simulation,nuc-1);
              fh = fopen(file,"r");
              if (fh != NULL) {
                 unsigned int c = 0;
                 while(!feof(fh)) {
                    if (fscanf(fh,"%f %lf",&time,&zeta[nuc-1][c]) != 2) {
                       continue;
                    }
                    if (zeta[nuc-1][c] < 0.0) {
                       zeta[nuc-1][c] += 360.0;
                    }
                 }
              } else {
                 printf("Could not read %s.\n",file);
                 exit(EXIT_FAILURE);
              }
    //read in epsilon
              sprintf(file,"%sepsilon%u",simulation,nuc-1);
              fh = fopen(file,"r");
              if (fh != NULL) {
                 unsigned int c = 0;
                 while(!feof(fh)) {
                    if (fscanf(fh,"%f %lf",&time,&epsilon[nuc-1][c]) != 2) {
                       continue;
                    }
                    if (epsilon[nuc-1][c] < 0.0) {
                       epsilon[nuc-1][c] += 360.0;
                    }
                 }
              } else {
                 printf("Could not read %s.\n",file);
                 exit(EXIT_FAILURE);
              }
           }
           for (unsigned int nuc = 1; nuc <= 4; nuc++) {
    //read in gamma
              sprintf(file,"%sgamma%u",simulation,nuc);
              FILE *fh;
              fh = fopen(file,"r");
              if (fh != NULL) {
                 unsigned int c = 0;
                 while(!feof(fh)) {
                    if (fscanf(fh,"%f %lf",&time,&gamma[nuc-1][c]) != 2) {
                       continue;
                    }
                    if (gamma[nuc-1][c] < 0.0) {
                       gamma[nuc-1][c] += 360.0;
                    }
                 }
              } else {
                 printf("Could not read %s.\n",file);
                 exit(EXIT_FAILURE);
              }
    //read in chi
              sprintf(file,"%schi%u",simulation,nuc);
              fh = fopen(file,"r");
              if (fh != NULL) {
                 unsigned int c = 0;
                 while(!feof(fh)) {
                    if (fscanf(fh,"%f %lf",&time,&chi[nuc-1][c]) != 2) {
                       continue;
                    }
                    if (chi[nuc-1][c] < 0.0) {
                       chi[nuc-1][c] += 360.0;
                    }
                 }
              } else {
                 printf("Could not read %s.\n",file);
                 exit(EXIT_FAILURE);
              }
           }
    //free memory for each torsion stored in memory
           free(alpha); alpha = NULL;
           free(beta); beta = NULL;
           free(gamma); gamma = NULL;
           free(epsilon); epsilon = NULL;
           free(zeta); zeta = NULL;
           free(chi); chi = NULL;
    This code works just fine (it is a snippet from the complete program). However, it is cumbersome and lengthy, and I want other people to be able to read it easily, in as few lines as possible, in C: more people know C than Perl, it is faster, and it is almost universally installed. Also, as you said, Perl is often criticized for being difficult to read.

    In Perl I can do the equivalent in 11 lines. How could I write this in C with a hash/associative array/whatever, in something like Perl's

    Code:
    push($data{$file},$variable)
    which can be accessed from

    Code:
    $data{$file}[$index]
    in C?

    Thanks for your time

  4. #4
    Join Date
    Jul 2007
    Location
    Poland
    Beans
    4,499
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: C hash of array, hash of hash, etc.

    does it have to be C? In C++ you have STL with maps, vectors and whatnot.

  5. #5
    Join Date
    Nov 2005
    Location
    Sendai, Japan
    Beans
    11,296
    Distro
    Kubuntu

    Re: C hash of array, hash of hash, etc.

    Remark: pasting code from an existing program is almost never appropriate as an example.
    "I'll be back by the evening of the day after tomorrow, okay."


  6. #6
    Join Date
    Aug 2011
    Location
    47°9′S 126°43′W
    Beans
    2,172
    Distro
    Ubuntu 16.04 Xenial Xerus

    Re: C hash of array, hash of hash, etc.

    Quote Originally Posted by hailholyghost View Post
    Hi Vaphell,

    The idea is to store all of the data from these files in one large hash, e.g. instead of manually creating bulky multidimensional arrays and reading data to each of 20 arrays separately:

    This code works just fine (it is a snippet from the complete program). However, it is cumbersome and lengthy, and I want other people to be able to read it easily, in as few lines as possible, in C: more people know C than Perl, it is faster, and it is almost universally installed. Also, as you said, Perl is often criticized for being difficult to read.

    In Perl I can do the equivalent in 11 lines. How could I write this in C with a hash/associative array/whatever, in something like Perl's

    Code:
    push($data{$file},$variable)
    which can be accessed from

    Code:
    $data{$file}[$index]
    in C?

    Thanks for your time
    Some thoughts:

    1. If C code that can run 10x faster than Perl code could be as terse, there would be no need for Perl
    2. I see an awful lot of duplication in the C code... The 6 pieces of code that read files could be one single function called 6 times.
    3. Is this code working? I don't see the "c" variables incremented (nor their values tested against "length")
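
    A minimal sketch of points 2 and 3, assuming each file holds two whitespace-separated columns (a time and an angle in degrees), as the fscanf calls in the quoted code suggest. The name read_torsion and its parameters are hypothetical and do not come from the original program.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read up to `length` angles from `path` into `out`.
 * Returns the number of values stored, or -1 if the file cannot be opened.
 * Negative angles are shifted by +360 degrees, as in the quoted code. */
long read_torsion(const char *path, double *out, size_t length)
{
    assert(path != NULL && out != NULL);

    FILE *fh = fopen(path, "r");
    if (fh == NULL)
        return -1;

    char line[256];   /* assumes no input line is longer than 255 characters */
    size_t c = 0;

    /* fgets drives the loop, so a malformed line is skipped rather than
     * re-read, and the `c < length` test keeps writes inside `out`. */
    while (c < length && fgets(line, sizeof line, fh) != NULL) {
        double angle;
        if (sscanf(line, "%*f %lf", &angle) != 1)
            continue;             /* not "<number> <number>": ignore it */
        if (angle < 0.0)
            angle += 360.0;
        out[c++] = angle;         /* the increment the quoted code lacks */
    }

    fclose(fh);
    return (long)c;
}
```

    With a helper like this, each of the six copy-pasted blocks shrinks to building the file name with sprintf and making one call, plus a check for a -1 return.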
    Warning: unless noted otherwise, code in my posts should be understood as "coding suggestions", and its use may require more neurones than the two necessary for Ctrl-C/Ctrl-V.

  7. #7
    Join Date
    Apr 2011
    Location
    Maryland
    Beans
    1,461
    Distro
    Kubuntu 12.04 Precise Pangolin

    Re: C hash of array, hash of hash, etc.

    Quote Originally Posted by Vaphell View Post
    Maybe you could describe what it does in plain English; not many people can read Perl incantations.
    LOL @ incantations!

    The script the OP posted, starts with an array of filenames stored in '@torsions'. Then the user is opening each file in that array one at a time (referring to each one as "$tor") and reading it line by line. If the line in the file matches the regex (note the captures):

    Code:
    /\d+\.\d+\s+(\d+)\.(\d+)/
    Then the data is stored in a hash of arrays with a structure like this (curly braces indicate a hash, called a dict in Python, and square brackets indicate an array):

    Code:
    {
        'alpha2' =>
            [
                "$1.$2"
            ],
        'alpha3' =>
            [
                "$1.$2"
            ],
        .
        .
        .
    }
    Note that there's a typo in the push, which should read:

    Code:
    push( @{$data{$tor}}, "$1.$2" );
    Also, since the user asked for a hash of arrays, note what '"$1.$2"' does: the two captures are interpolated into a single string, the integer and fractional parts of the second number rejoined with a period (e.g. "123.45"). Each matching line therefore pushes one element onto the array for that file.

    From a brief excursion with C (which I intend to get back to once the dust settles on a few other things I'm working on at the moment), I can say that this is not so easy in C and is probably one of the reasons people took to and really liked Perl when it came out.
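
    To give a feel for the extra work, here is a rough sketch of the regex step alone in C, using the POSIX <regex.h> interface. The function name match_second_number is hypothetical, and the pattern is rewritten with bracket classes because POSIX extended syntax has no \d or \s.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: apply the thread's pattern to one line and, on a
 * match, write the equivalent of Perl's "$1.$2" into `out`.
 * Returns 1 on a match, 0 otherwise. */
int match_second_number(const char *line, char *out, size_t outsize)
{
    assert(line != NULL && out != NULL && outsize > 0);

    /* Same shape as /\d+\.\d+\s+(\d+)\.(\d+)/ in POSIX extended syntax. */
    static const char *pattern =
        "[0-9]+\\.[0-9]+[[:space:]]+([0-9]+)\\.([0-9]+)";

    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;

    regmatch_t m[3];   /* m[0] is the whole match, m[1] and m[2] the captures */
    int found = (regexec(&re, line, 3, m, 0) == 0);
    if (found) {
        /* Copy both captured substrings, rejoined with a period. */
        snprintf(out, outsize, "%.*s.%.*s",
                 (int)(m[1].rm_eo - m[1].rm_so), line + m[1].rm_so,
                 (int)(m[2].rm_eo - m[2].rm_so), line + m[2].rm_so);
    }

    regfree(&re);
    return found;
}
```

    Compiling the pattern on every call keeps the sketch short; a real program would compile it once and reuse the regex_t for every line.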

  8. #8
    Join Date
    Jul 2007
    Location
    Poland
    Beans
    4,499
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: C hash of array, hash of hash, etc.

    yup, C is largely about rolling your own solutions to problems, and that means a lot of boilerplate

    In the case of C++, the STL's map (associative array) and vector (array) should take care of the data structure needs and simplify memory management.
    C++ is in the same ballpark as C when it comes to performance. 11 lines are not going to happen, but it should be manageable.
    http://en.wikipedia.org/wiki/Standard_Template_Library

  9. #9
    Join Date
    Feb 2009
    Beans
    1,469

    Re: C hash of array, hash of hash, etc.

    Here's how I might translate that original Perl snippet into C:
    Code:
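    #define NUM_TORSIONS 20
    char *torsions[NUM_TORSIONS] = {"alpha2", "alpha3", "alpha4", "beta2",
    "beta3", "beta4", "gamma1", "gamma2", "gamma3", "gamma4", "epsilon1",
    "epsilon2", "epsilon3", "zeta1", "zeta2", "zeta3", "chi1", "chi2", "chi3",
    "chi4"};
    /* One dict keyed by file name; each value is a growable list of doubles. */
    dict_t *data = dict_from_keys(torsions, NUM_TORSIONS);
    for (int i = 0; i < NUM_TORSIONS; i++) {
    	FILE *fh;
    	char line[256];
    	(fh = fopen(torsions[i], "r"))
    		|| log_exit("Cannot read %s", torsions[i]);
    	/* Read line by line so a malformed line is skipped, as in the Perl;
    	   looping on !feof() would spin forever once fscanf stops matching. */
    	while (fgets(line, sizeof line, fh)) {
    		double temp;
    		/* %*f discards the first column; %lf stores the second. */
    		if (sscanf(line, "%*f %lf", &temp) == 1) {
    			dict_set(data, torsions[i],
    				list_create_or_append(dict_get(data, torsions[i]), temp));
    		}
    	}
    	fclose(fh);
    }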
    Definitions of dict_t (which is a typedef), dict_from_keys, log_exit, dict_set, dict_get, and list_create_or_append are left as an exercise for the reader, but you can see how closely it matches the Perl and where I had to get a little verbose to do something in C that Perl does for you. Obviously for your real program you'd need additional functions to support the other stuff you need to do.
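
    For readers who want a starting point for that exercise, here is one possible sketch of the helpers. Everything in it is an assumption rather than the poster's design: the list_t type, the linear-scan lookup, the explicit dict argument to dict_get and dict_set, and the key count passed to dict_from_keys.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A growable array of doubles: the "array" half of the hash of arrays. */
typedef struct {
    double *items;
    size_t  count, capacity;
} list_t;

/* A fixed set of keys, each owning at most one list. Lookup is a linear
 * scan, which is adequate for 20 keys but is not a real hash table. */
typedef struct {
    char   **keys;
    list_t **values;
    size_t   nkeys;
} dict_t;

/* Print a printf-style message to stderr and exit. It is declared int
 * only so that it can appear on the right-hand side of `||`. */
int log_exit(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    exit(EXIT_FAILURE);
}

dict_t *dict_from_keys(char **keys, size_t nkeys)
{
    assert(keys != NULL && nkeys > 0);
    dict_t *d = malloc(sizeof *d);
    if (d == NULL)
        log_exit("out of memory");
    d->keys = keys;                 /* borrowed: the caller keeps ownership */
    d->nkeys = nkeys;
    d->values = malloc(nkeys * sizeof *d->values);
    if (d->values == NULL)
        log_exit("out of memory");
    for (size_t i = 0; i < nkeys; i++)
        d->values[i] = NULL;        /* no list until the first append */
    return d;
}

/* Index of `key` in the dict, or -1 if it is not one of the keys. */
static long dict_index(const dict_t *d, const char *key)
{
    for (size_t i = 0; i < d->nkeys; i++)
        if (strcmp(d->keys[i], key) == 0)
            return (long)i;
    return -1;
}

/* Returns the list stored under `key`, or NULL if there is none. */
list_t *dict_get(const dict_t *d, const char *key)
{
    long i = dict_index(d, key);
    return i < 0 ? NULL : d->values[i];
}

void dict_set(dict_t *d, const char *key, list_t *value)
{
    long i = dict_index(d, key);
    assert(i >= 0);                 /* keys are fixed at creation time */
    d->values[i] = value;
}

/* Append `x` to `l`, creating the list first when `l` is NULL. */
list_t *list_create_or_append(list_t *l, double x)
{
    if (l == NULL) {
        l = malloc(sizeof *l);
        if (l == NULL)
            log_exit("out of memory");
        l->items = NULL;
        l->count = l->capacity = 0;
    }
    if (l->count == l->capacity) {  /* full: double the storage */
        size_t cap = l->capacity ? 2 * l->capacity : 16;
        double *p = realloc(l->items, cap * sizeof *p);
        if (p == NULL)
            log_exit("out of memory");
        l->items = p;
        l->capacity = cap;
    }
    l->items[l->count++] = x;
    return l;
}
```

    With definitions along these lines, dict_get(data, "beta2")->items[1701] plays the role of Perl's $data{beta2}[1701], provided that list exists and holds at least 1702 values. Freeing the lists and the dict is omitted here.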

    C++ would make this much easier, I'm sure. But Perl should be right speedy for I/O and text, so I would suggest just using Perl, if it's an option and you know how. If you really need C for performance in another part of the program, you could write a C library to do just that part, and then call it from Perl. Likely still less trouble than writing the whole thing in C.

    (I'm assuming you're partial to Perl, or already have parts of this program written in Perl, but this applies equally well to other languages. As usual, your application will determine the best solution.)

  10. #10
    Join Date
    Jan 2009
    Location
    Pennsylvania
    Beans
    113
    Distro
    Ubuntu 14.10 Utopic Unicorn

    Re: C hash of array, hash of hash, etc.

    Hi everyone,

    thanks very much for your very thoughtful replies.

    IMHO, and I'm not a very experienced programmer, I can generally build C equivalents of Perl's hash structures out of identically indexed arrays.

    For example, if f(x,y) = z, then I can store all the x values in one array, the y values in a second, and the z values in a third, so that the nth element of each array belongs to the same data point.
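
    A small sketch of that parallel-array idea, under the assumption that lookups are by exact (x, y) value. The names xs, ys, zs, add_point and lookup, and the MAX_POINTS limit, are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_POINTS 1000

/* Parallel arrays: element n of each array describes the same data point,
 * i.e. f(xs[n], ys[n]) = zs[n]. */
static double xs[MAX_POINTS], ys[MAX_POINTS], zs[MAX_POINTS];
static size_t npoints = 0;

/* Store one (x, y, z) triple. Returns 0 on success, -1 if the arrays are full. */
int add_point(double x, double y, double z)
{
    if (npoints == MAX_POINTS)
        return -1;
    xs[npoints] = x;
    ys[npoints] = y;
    zs[npoints] = z;
    npoints++;
    return 0;
}

/* Find z for an exact (x, y) pair by walking the arrays in step.
 * Returns 1 and writes *z on success, 0 if the pair was never stored. */
int lookup(double x, double y, double *z)
{
    assert(z != NULL);
    for (size_t n = 0; n < npoints; n++) {
        if (xs[n] == x && ys[n] == y) {
            *z = zs[n];
            return 1;
        }
    }
    return 0;
}
```

    The trade-off against a Perl hash is that every lookup scans the arrays from the start, so it costs time proportional to the number of stored points.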

    Perl really isn't appropriate for this program. The Perl equivalent took more than an hour to run. The C program computes a lot of running averages, which Perl doesn't seem to be very good at.

    I will look into learning C++ though. The Standard Template Library looks very useful!

    Much appreciated and a happy Thanksgiving!
    -DC
