Thread: Is it C# itself, .NET, or my regex pattern that's a problem?

  1. #1
    Registered User Chris87's Avatar
    Join Date
    Dec 2007
    Posts
    139

    Is it C# itself, .NET, or my regex pattern that's a problem?

    After much frustration trying to get std::regex or boost::regex working under Windows (both work elsewhere), I decided to try my luck with C#. Basically, I'm trying to parse a list of instructions for a game, written in a certain format. My C# implementation doesn't work at all.

    Here's a sample of the input file:
    Code:
    LANGUAGE {English/ANSI}
    LEVEL6 {}
    ITEMTOT {124}
    INAM1    {dagger}
    IDES1    {a dagger}
    ISTT1    {15 10 0 1 3 1 0 0 0 1 0 0 0 0 0 0}
    IEFF1    {}
    And the code itself:
    Code:
    using System;
    using System.IO;
    using System.Text.RegularExpressions;
    
    namespace MyGame
    {
    	static class Program
    	{
    		static bool parseScript(string filename)
    		{
    			StreamReader sr = new StreamReader(filename);
    			Regex labelSyntax = new Regex(@"(\w+)([0-9]*)\s+\{(.*)\}");
    			
    			while (!sr.EndOfStream)
    			{
    				string line = sr.ReadLine();
    				MatchCollection m = labelSyntax.Matches(line);
    				Console.WriteLine("{0}", m.Count);
    				
    				if (m.Count > 0)
    				{
    					switch (m[0].Value)
    					{
    						case "LANGUAGE":
    							Console.WriteLine("Locale set to {0}", m[2].Value);
    							break;
    						default:
    							break;
    					}
    				}
    			}
    			
    			sr.Close();
    			return true;
    		}
    		
    		static int Main(string[] args)
    		{
    			parseScript(args[0]);
    			return 0;
    		}
    	}
    }
    At this point I'm wondering if it's simply invalid regex. I'm trying to capture the label, the number at the end separately (if any), and what's between the curly braces, if anything.

  2. #2
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    My thinking is that the dot operator consumes the trailing curly brace. The dot is very hard to use well: a lot of the time it will match more than you intended. You can improve the match with a negated set, i.e. match anything that isn't a closing curly brace, then match the closing curly brace.
    Code:
    @"(\w+)([0-9]*)\s+\{([^\}]*)\}"
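
    A minimal sketch of how that pattern might be used, assuming the captured parts are read through Match.Groups rather than by indexing the MatchCollection (in the original code, m[0].Value is the whole matched text, not the label). LabelParser and TryParse are hypothetical names for illustration, not something from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the thread.
static class LabelParser
{
    // Same pattern as above: label, optional digits, whitespace, then {body}.
    static readonly Regex LabelSyntax =
        new Regex(@"(\w+)([0-9]*)\s+\{([^\}]*)\}");

    // Returns true and fills label/body when the line matches the pattern.
    public static bool TryParse(string line, out string label, out string body)
    {
        Match m = LabelSyntax.Match(line);
        // Groups[0] is the whole match; Groups[1] and Groups[3] are the captures.
        label = m.Success ? m.Groups[1].Value : "";
        body = m.Success ? m.Groups[3].Value : "";
        return m.Success;
    }
}
```

    With the sample line "LANGUAGE {English/ANSI}", TryParse should return true with label set to "LANGUAGE" and body set to "English/ANSI". On a line with more than one brace pair, such as "A {x} B {y}", the negated set stops the body at the first closing brace.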

  3. #3
    Registered User Chris87's Avatar
    Join Date
    Dec 2007
    Posts
    139
    Success! Thank you!!! Wow, that plagued me for quite some time.

  4. #4
    Registered User
    Join Date
    May 2003
    Posts
    1,619
    You'll have the same kind of issue with the \w+ and the [0-9]*. Regex quantifiers are greedy by default, and \w matches digits as well as letters, so your \w+ is going to consume the number at the end of the label as well, and the [0-9]* will match zero characters.
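
    One way to see this, along with one possible fix, as a sketch only: it assumes labels consist of letters only, so the label group can be narrowed to [A-Za-z]+ and the digits are left for the second group. GreedyDemo and Split are hypothetical names for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the thread.
static class GreedyDemo
{
    // Original label part: \w also matches digits, so \w+ takes "INAM1" whole.
    public static readonly Regex Greedy =
        new Regex(@"(\w+)([0-9]*)\s+\{([^}]*)\}");

    // Assumption: labels are letters only, so the digits fall through to group 2.
    public static readonly Regex LettersOnly =
        new Regex(@"([A-Za-z]+)([0-9]*)\s+\{([^}]*)\}");

    // Returns { label, number, body }, or null when the line does not match.
    public static string[] Split(Regex pattern, string line)
    {
        Match m = pattern.Match(line);
        if (!m.Success) return null;
        return new[] { m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value };
    }
}
```

    For the sample line "INAM1    {dagger}", Split with Greedy should give "INAM1", "" and "dagger", while Split with LettersOnly should give "INAM", "1" and "dagger".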
