Page 2 of 3 FirstFirst 123 LastLast
Results 11 to 20 of 28

Thread: Beginners programming challenge #27

  1. #11
    Join Date
    Jul 2010
    Beans
    96
    Distro
    Lubuntu

    Re: Beginners programming challenge #27

    Quote Originally Posted by Barrucadu View Post
    "." does not match newlines, so running .* on "hello\nworld" will have two matches: ["hello", "world"], and so "hello" should be printed.
    By the same reasoning, running "a*b" on "cbab" should return two matches: "b" and "ab", and so print "b". Is it right, or should I treat the newline as a special case?

    Sorry for the questions, but though I've used regexps I never really studied them.
    - What do you get if you multiply six by nine?
    - 42!

  2. #12
    Join Date
    May 2007
    Location
    East Yorkshire, England
    Beans
    Hidden!

    Re: Beginners programming challenge #27

    It's not a special case, and that's right, "b" is the first match there.
    Website | Blog | The Arch Hurd Project

    If you want to ask about something I posted, send a PM, as I don't watch many threads

  3. #13
    Join Date
    Jul 2010
    Beans
    96
    Distro
    Lubuntu

    Re: Beginners programming challenge #27

    Ok, thanks.

    I'm going to work on it when I'm done with my exams, it seems a nice one. Though I guess that in C it won't be pretty.
    - What do you get if you multiply six by nine?
    - 42!

  4. #14
    Join Date
    Sep 2009
    Location
    Canada, Montreal QC
    Beans
    1,809
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: Beginners programming challenge #27

    Quote Originally Posted by Odexios View Post
    Just to make sure I got the meaning of "it will print the text that was first matched by the regular expression".

    1) If the regex is ".*", should it always match anything and print nothing?

    2) If the regex is ".*at", and the string in input is "Matt is at the gym", it should match "Mat", right?
    The other users gave you the right answers . I am sorry if this challenge is too difficult, but since I posted it early, no need to rush. I won't judge this for the next 2 weeks.
    Last edited by cgroza; July 10th, 2012 at 04:02 PM.
    I know not with what weapons World War III will be fought, but World War IV will be fought with sticks and stones.
    Freedom is measured in Stallmans.
    Projects: gEcrit

  5. #15
    Join Date
    May 2007
    Location
    East Yorkshire, England
    Beans
    Hidden!

    Re: Beginners programming challenge #27

    Quote Originally Posted by cgroza View Post
    Yes, you are correct.
    Not "Matt is at"?
    Website | Blog | The Arch Hurd Project

    If you want to ask about something I posted, send a PM, as I don't watch many threads

  6. #16
    Join Date
    Sep 2009
    Location
    Canada, Montreal QC
    Beans
    1,809
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: Beginners programming challenge #27

    Quote Originally Posted by Odexios View Post
    Ok, thanks.

    I'm going to work on it when I'm done with my exams, it seems a nice one. Though I guess that in C it won't be pretty.
    You could explore other languages that allow functions to be a first class citizen. I would imagine that a feature like that would greatly simplify your task.
    I know not with what weapons World War III will be fought, but World War IV will be fought with sticks and stones.
    Freedom is measured in Stallmans.
    Projects: gEcrit

  7. #17
    Join Date
    Sep 2009
    Location
    Canada, Montreal QC
    Beans
    1,809
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: Beginners programming challenge #27

    Quote Originally Posted by Barrucadu View Post
    Not "Matt is at"?
    I edited my post, check again.
    I know not with what weapons World War III will be fought, but World War IV will be fought with sticks and stones.
    Freedom is measured in Stallmans.
    Projects: gEcrit

  8. #18
    Join Date
    Nov 2010
    Beans
    271
    Distro
    Ubuntu 10.10 Maverick Meerkat

    Re: Beginners programming challenge #27

    What happened to this thread?

    There are some problems with this challenge.
    It asks only to match the first found match which arises some confusions.

    for e.g.
    if searched on 'sin' for 'n?', it should match at 's' or, even 'l?' should match, because it represents that 'n' or 'l' may or not be present.
    Is that expected?

  9. #19
    Join Date
    Feb 2009
    Beans
    1,469

    Re: Beginners programming challenge #27

    Yes, a regex engine should allow /n?/ to match any string, even the empty string.

    Assuming ? is greedy, that pattern will match 'n' in "nine" and '' in "line". (If it were non-greedy, it would never match anything but ''.)

  10. #20
    Join Date
    Nov 2010
    Beans
    271
    Distro
    Ubuntu 10.10 Maverick Meerkat

    Re: Beginners programming challenge #27

    This is my entry in C.
    It doesn't do much.
    Inputs can only be taken from the terminal.
    '.' '?' '+' and '^' '$' are implemented.
    escaping isnt supported.

    Only thing good I can tell about it is - its functional as a regex matching engine. Although not all asked are implemented, it does work, but has some bugs.

    You can take a look at the code here for easier reading.

    Code:
    /*
      fool_regx.c
      challenge #27 entry :: debd
      am literally laughing at me ...
      hope actual regex engines are never implemented like this.
      lots of ifs, 37 to be precise.
      $, ^, ?, ., and + are implemented. There are some bugs.
      escaping the metas aren't supported.
      ^ and $ if present anywhere except in the beggining and in the end respectively are treated as literals.
      Input is taken from commandline.
      Input terminates with a '\n'
      Output is in the form of the entire string given where matches are highlighted in red.
    */
    
    #include <stdio.h>
    #include <stdlib.h>
    
    int main(void)
    {
      int str[500], i=0, k=0, w=0, wstart=0, q=0, p=0, chance=-1, wrd[200];
      int c, dummy=0, literals=0, bseek=0, metafound=0, char_dol=0;
      int match_from=0, match_start=0, match_done=0, s_before_plus=0;
      int last_whitesp=0;
    
      /* Variable descriptions::
    
      str[] :: holds the string to search in
      wrd[] :: stores the regex
      i :: count of input characters for string + 2
      k :: count of input characters for regex + 2
      p :: loop variable for looping through the string
      q :: loop variable for looping through the regex
      wstart :: stores position of last space found.
      chance :: if there is a match on progress
      literals :: count of everything except ^ and $ in the regex
      bseek :: last spcae position in case first char of the regex is a '.'
      metafound :: if any of '.' '+' '?' is found in regex
      char_dol :: if ^ or $ is present in regex in the first / last position
      match_from :: where to start searching for a match when ^ or $ is present
      match-start :: marks the start of a match for highlighting purpose
      match_done :: if a match has been found
      s_before_plus :: character prior to a '+'
      last_whitesp :: last white space in the input string
      dummy :: a dummy
    
      wstart and bseek were used in the 
      word version of the program i.e. the words where the matches were found were printed.
      Not necesssary now.
      
      */
      
      printf("Search in: ");
      do
      {
        c = getchar();
        if(c == ' ') last_whitesp = i;
        str[i++] = c;
      } while(c != '\n' && i < 500);
    
      printf("Regex: ");
      do
      {
        w = getchar();
        wrd[k++] = w;
        if(w == '.' || w == '?' || w == '+') metafound = 1;
      } while(w != '\n' && k < 200);
    
      literals = k-2; /* exclude the last newline and -1 more because k is increased after the last k++*/
      
      if(wrd[0] == '^')
        {
          q=1;
          literals = k-3; /* apparantly */
          char_dol = 1;
        }
    
      if(wrd[k-2] == '$')
        {
          literals = k-3;
          if(metafound) match_from = last_whitesp+1;  /* Entire word could match, not only the last chars */
          else match_from = (i-2) - (literals);       /* try for the last 'literal' number of chars of the string */
          if(metafound && wrd[k-3] == '.') match_from = (i-2) - (literals);  /* if last char of regex is '.' no need to try entire word */
          char_dol = 1;
          /*literals = k-3;*/
        }
    
      /*if(match_from > 0)  this portion is no more needed
      { 
        bseek = match_from;
        while(str[bseek] != str[0])
        {
          if(str[bseek] == ' ')
            {
              wstart = bseek+1;
              break;
            }
          bseek--;
        }
      }*/
    
      for(p=match_from; p<i-1; p++) /* start loop to find match */
      {
        /*printf("p:q=%i:%i", p,q);*/
    
        if(wrd[q] != str[p] && wrd[q] != '.' && wrd[q] != '?' && wrd[q] != '+') /* handles unmatch, index control for '?' */
          {
            if(wrd[q+1] == '?')
            {
              if(!q) match_start = p; /* mark start of match if next one is a '?' and regex is at begining */
              if(literals > q)
                q++, p--;             /* increase q but dont let p increase so we can check next in string with next in regex */
              continue;
            }
            q=0;
            chance = -1;
            if(literals > 0) p=match_start; /* after a match fails midway set p to previous start of match */
            match_start++;                  /* so it could be tried from on the next char in string */
            continue;
          }
    
        if(wrd[q] == str[p] || wrd[q] == '.' || wrd[q] == '?' || wrd[q] == '+')   /* this block holds most of the match logic and index control */
          {
            if(!q) match_start = p;
            if(wrd[q] == '?') p--;  /* we are at a ?. We dont care about chars preceding ? and the ? itself */
                                    /* to check next in regex with present char at p, dont let p increase */
                                    /* p--, p++ keeps it at same position */
    
            if(wrd[q] == '.' && wrd[q+1] == '+')  /* piece of cake */
              p=i-2, match_done=1;                /* p is also used to mark the end of match */
    
            if(wrd[q] == str[p] && wrd[q+1] == '+')   /* if next is a '+' remember what preceeds it */
              s_before_plus = wrd[q];
    
            if(wrd[q] == '+')
            {
              if(p == i-2) match_done = 1;      /* if p is already at the end, match is found */
              if (str[p] == s_before_plus) q--; /* if there are same chars as s_before_plus in the string increase p as */
               else p--;                        /* long as an unmatch is found. so keep q at same pos. otherwise decrease */
                                                /* p so we can check next in string with next in regex */
              
            }
            if(q == literals) match_done = 1;   /* if q is already == literals say match done. true only if regex contains no meta. */
            if(wrd[q] != '+' && char_dol)
            {
              if(char_dol && q == literals+1) match_done = 1;   /*  we took literals one less when a ^ / $ is present. */
            }
            else
            {
              if(char_dol) {      /* when + is present we stop when a unmatch is found after '+' */
                if(s_before_plus != str[p]) match_done = 1; }
            }
            
            if(q <= literals)
            { q++;
              chance = 1;
            }
          }
    
        if(str[p] == ' ')
        {
          if(chance == -1)
            wstart = p+1;  /* was used previously */
        }
    
        if(wrd[0] == '^' && p == literals+1)
          break;
    
    
        if(match_done)
        {
          printf("Match found : ");
          
          /*while(wstart <= p)  this portion is the code to output ib word mode.
          {
            if(wstart >= match_start)
              printf("\E[31m%c\E[0m\017", str[wstart++]);
            else
            printf("%c", str[wstart++]);
            if(wstart > p)
              {
                dummy = wstart;
                while(str[dummy] != ' ' && str[dummy] != '\n')
                  printf("%c", str[dummy++]);
              }
          }*/
          wstart=0;   /* we nomore use wstart as a holder for last space position */
          while(wstart < i-1)
          {
            if(wstart >= match_start && wstart <= p)    /* color from match start up to latest p */
              printf("\E[31m%c\E[0m\017", str[wstart++]);
            else
              printf("%c", str[wstart++]);              /* else vanilla */
          }
          puts("");
          exit(0);
        }
      }
    
      printf("Match not found.\n");
      exit(1);
    }
    Here are some example outputs:
    Last edited by debd; August 26th, 2012 at 05:05 PM.

Page 2 of 3 FirstFirst 123 LastLast

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •