Thread: strspn problems with '#' character.

  1. #1
    Registered User
    Join Date
    Apr 2019
    Posts
    121

    I have noticed a problem that I can't explain.

    When using `strspn`, it won't recognize the pound symbol unless it's the first character in the string.

    Create 2 files:
    Code:
    touch "#here" "there#"
    Compile this code:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main (int argc, char *argv[])
    {
        printf("Location: %lu\n", (argv[1], "#"));
        return(0);
    }
    When you run `program \#here` it lists the location as 1, which is correct. When you run `program there#` it says the position is 0, which is wrong.

    I'm using `strspn` to search for a bunch of special characters, and every other character seems to work with it, except the pound symbol. I understand I won't be able to change this behaviour, but why is it happening?

  2. #2
    Registered User
    Join Date
    Sep 2022
    Posts
    59
    You left out strspn in your example code, right?

    However, what shell do you use to call your program?

  3. #3
    and the hat of int overfl
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,675
    > printf("Location: %lu\n", (argv[1], "#"));
    What!?

    Code:
    $ gcc -Wall bar.c
    bar.c: In function ‘main’:
    bar.c:6:39: warning: left-hand operand of comma expression has no effect [-Wunused-value]
        6 |     printf("Location: %lu\n", (argv[1], "#"));
          |                                       ^
    bar.c:6:25: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘char *’ [-Wformat=]
        6 |     printf("Location: %lu\n", (argv[1], "#"));
          |                       ~~^     ~~~~~~~~~~~~~~
          |                         |             |
          |                         |             char *
          |                         long unsigned int
          |                       %s
    > When you run `program \#here` it lists the location as 1, which is correct. When you run `program there#` it says the position is 0, which is wrong.
    Maybe read the manual page again.
    The strspn() function calculates the length (in bytes) of the initial segment of s which consists entirely of bytes in accept.
    As written, all you're doing is counting how many leading # there are in the string.
    The number and placement of # anywhere else in the string is irrelevant.
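    To make the manual's wording concrete, here is a minimal sketch. The helper name `leading_hashes` is made up for illustration; it only wraps the standard `strspn` call from the first post.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the run of '#' characters at the very
 * start of s. This is all that strspn(s, "#") measures; a '#' that
 * appears after any other character is never counted. */
size_t leading_hashes(const char *s)
{
    return strspn(s, "#");
}
```

    With this, `leading_hashes("#here")` is 1 and `leading_hashes("there#")` is 0, which matches the output reported in the first post. Neither value is a position; both are counts of leading `#` characters.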

  4. #4
    Registered User
    Join Date
    Sep 2022
    Posts
    59
    Oh, right. However, # is also a character with special meaning in shell languages. It introduces comments for example in bash or PowerShell.

  5. #5
    Registered User
    Join Date
    Apr 2019
    Posts
    121
    Quote Originally Posted by aGerman View Post
    You left out strspn in your example code, right?

    However, what shell do you use to call your program?
    Sorry, must have cut the function name instead of copying it.
    Code:
    #include <stdio.h>
    #include <string.h>
     
    int main (int argc, char *argv[])
    {
        printf("Location: %lu\n", strspn(argv[1], "#"));
    
        return(0);
    
    }
    I use the Bash shell, and I am starting to think this may be the problem. I just don't know why it's happening with filenames.

  6. #6
    Registered User
    Join Date
    Dec 2017
    Posts
    1,664
    strcspn does what you want. The 'c' stands for complement, and it counts how many leading characters of your string are not any of the given characters. It returns the length of the string if none of the given characters are in the string.
    Code:
        size_t n = strcspn(argv[1], "#");
        if (n < strlen(argv[1]))
            printf("First # is at position %zu\n", n);  // %zu is correct for size_t
        else
            printf("# is not in the string\n");
    You could also use strchr and subtract the original string start from the result (if not NULL).
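    The strchr alternative mentioned above might look like the following sketch. The helper name `index_of` and the choice of -1 as the "not found" value are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first occurrence of ch in s,
 * or -1 if ch does not occur. strchr returns a pointer into s, so
 * subtracting s from it gives the offset. */
ptrdiff_t index_of(const char *s, char ch)
{
    const char *p = strchr(s, ch);
    return p != NULL ? p - s : -1;
}
```

    Here `index_of("there#", '#')` is 5 and `index_of("#here", '#')` is 0. Unlike strcspn, a missing character is reported as -1 rather than as the length of the string.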

  7. #7
    Registered User
    Join Date
    Apr 2019
    Posts
    121
    Ok, this must be related to `strspn`:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main (int argc, char *argv[])
    {
        printf("Location: %lu\n", strspn(argv[1], "#"));
    
        for(size_t x = 0; x < strlen(argv[1]); x++)
            putchar(argv[1][x]);
    
        return(0);
    }
    The loop shows that the pound symbol is still a pound symbol inside the program. I guess I won't use `strspn`; I'll have to step through each individual character.

  8. #8
    Registered User
    Join Date
    Apr 2019
    Posts
    121
    Quote Originally Posted by john.c View Post
    strcspn does what you want. The 'c' stands for complement, and it counts how many leading characters of your string are not any of the given characters. It returns the length of the string if none of the given characters are in the string.
    Code:
        size_t n = strcspn(argv[1], "#");
        if (n < strlen(argv[1]))
            printf("First # is at position %zu\n", n);  // %zu is correct for size_t
        else
            printf("# is not in the string\n");
    You could also use strchr and subtract the original string start from the result (if not NULL).
    Ty, I will look into these.

  9. #9
    Registered User
    Join Date
    Apr 2019
    Posts
    121
    Ok, `strcspn` doesn't work. `strcspn` does the exact opposite of my first problem. Yes, it solves the problem of the location with `there#`, but not `#here`. And `strchr` won't work without extra effort. Here is the full scope of what I am trying to do.

    I want to remove special characters from filenames. If it is marked for tv, the special character is encoded, to keep the filename safe for processing. It's nice to have an '!' displayed in a filename, without the filename having an '!'. And if it isn't for tv, the character is removed by replacing it with a '.'.
    Code:
    #define SPECIAL_CHARS      "':,\"?&!()$/#;_<>*%|^@"
    
    void fix_file (char *file, int tv_flag)
    {
    // Declare variables.
        char new_filename[FILENAME_SIZE] = {0};
    
    // Loop through all characters.
        for(size_t x = 0; x < strlen(file); x++)
    // If it is a special character.
            if(strspn(&file[x], SPECIAL_CHARS) == 1)
    // If it needs to be encoded.
                if(tv_flag == FLAG_YES)
                    add_character(file[x], new_filename);
                else
                {
    // Remove character.
                    new_filename[strlen(new_filename) + 1] = '\0';
                    new_filename[strlen(new_filename)] = '.';
                }
            else
            {
    // Keep the character.
                new_filename[strlen(new_filename) + 1] = '\0';
                new_filename[strlen(new_filename)] = file[x];
            }
    }
    If I go the `strchr` route, which would work, I would need a bunch of tests for all the special characters I'm searching for. I was hoping to use this function, because I liked the ability of skipping the directory entries '.' and '..'.
    Code:
    // Skip '.' and '..' entries.
            if(dir->d_name[strspn(dir->d_name, ".")] == '\0')
                continue;
    I guess I could make one test for '#' with `strchr` outside of the other search.
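    One thing worth noting about the directory-skipping idiom above: it matches any name made up only of dots, not just '.' and '..'. The sketch below compares it with an exact test; both function names are invented for this example.

```c
#include <stdbool.h>
#include <string.h>

/* The idiom from the post: true for any name consisting only of dots.
 * That covers "." and "..", but also "..." and the empty string. */
bool is_all_dots(const char *name)
{
    return name[strspn(name, ".")] == '\0';
}

/* Stricter alternative: true only for the "." and ".." entries. */
bool is_dot_or_dotdot(const char *name)
{
    return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}
```

    Both return false for a hidden file such as `.bashrc`. They differ only for names like `...`, which is a legal (if unusual) filename that the strspn version would skip.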

    Thanks for your input guys.

  10. #10
    Registered User
    Join Date
    Dec 2017
    Posts
    1,664
    What do you mean by
    > with `there#`, but not `#here`
    With `#here`, strcspn says the position is 0, so it works.

    EDIT:
    If file[x] is a single character (I don't know what the & is for), then you could use strchr backwards, strchr(SPECIAL_CHARS, file[x])

    If it returns NULL then file[x] is not a special char.

    You could also maybe look at strpbrk

    strpbrk - cppreference.com
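    As a rough illustration of strpbrk, the sketch below walks a string from one special character to the next. `SPECIAL_CHARS` is copied from the earlier post; the helper `count_special` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define SPECIAL_CHARS "':,\"?&!()$/#;_<>*%|^@"

/* Hypothetical helper: number of characters in s that appear in
 * SPECIAL_CHARS. strpbrk returns a pointer to the first such character
 * at or after its first argument, or NULL when there are none left. */
size_t count_special(const char *s)
{
    size_t n = 0;
    for (const char *p = strpbrk(s, SPECIAL_CHARS); p != NULL;
         p = strpbrk(p + 1, SPECIAL_CHARS))
        n++;
    return n;
}
```

    Stepping to `p + 1` is safe because strpbrk never matches the terminating null byte, so `p` always points at a real character. In the loop body, `p - s` would give the index of each match.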
    Last edited by john.c; 03-02-2024 at 01:11 PM.

  11. #11
    Registered User
    Join Date
    Apr 2019
    Posts
    121
    Quote Originally Posted by john.c View Post
    What do you mean by "with `there#`, but not `#here`"? With `#here` strcspn says the position is 0, so it works.
    You're right. My local copy of `man strcspn` does not list any errors or say what happens if nothing is found. I read somewhere that `strspn` returns 0 when not found, and I wrongly assumed that 0 from `strcspn` had the same meaning. Ty for clearing this up, this will work.

    Quote Originally Posted by john.c View Post
    If file[x] is a single character (I don't know what the & is for)
    The & is because both functions require a string to search. This is why I was testing for 1, it means the first character is special. If the character is not special, I have to copy the character as is.

    Quote Originally Posted by john.c View Post
    then you could use strchr backwards, strchr(SPECIAL_CHARS, file[x])
    That fixes that problem. Thanks.

  12. #12
    Registered User
    Join Date
    May 2012
    Location
    Arizona, USA
    Posts
    957
    Quote Originally Posted by Yonut View Post
    The & is because both functions require a string to search. This is why I was testing for 1, it means the first character is special. If the character is not special, I have to copy the character as is.
    But strspn returns 1 only when the first character is special and the next character is not special. So strspn("##foo", SPECIAL_CHARS) will return 2.

    strspn and strcspn are the wrong tools for this job, as you've found.
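    The difference can be seen in a small sketch. `SPECIAL_CHARS` is taken from the earlier post; the two function names are invented here to contrast the `== 1` test with a single-character test.

```c
#include <stdbool.h>
#include <string.h>

#define SPECIAL_CHARS "':,\"?&!()$/#;_<>*%|^@"

/* The test from the earlier post: true only when s starts with exactly
 * one special character, followed by a non-special one or the end. */
bool starts_with_one_special(const char *s)
{
    return strspn(s, SPECIAL_CHARS) == 1;
}

/* Single-character test: true when c is one of the special characters.
 * The c != '\0' check matters because strchr also finds the terminator. */
bool is_special(char c)
{
    return c != '\0' && strchr(SPECIAL_CHARS, c) != NULL;
}
```

    For `"##foo"` the first function returns false, because strspn reports 2, so the leading `#` would be copied through unchanged. `is_special('#')` is true regardless of what follows. Testing `strspn(...) != 0` instead of `== 1` would also avoid the problem.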

  13. #13
    Registered User
    Join Date
    Feb 2019
    Posts
    1,077
    strspn() and strcspn() count how many consecutive chars at the beginning of the string are (strspn) or aren't (strcspn) in a given set:
    Code:
      size_t sz = strspn( "#!#abc", "#!" ); // will return 3 because the first 3 chars are all '#' or '!'.
    strcspn() is the complement of this span count. It counts how many chars at the beginning of the string are not in the set:
    Code:
      size_t sz = strcspn( "abc#!#", "#!" ); // will return 3 because 'a', 'b' and 'c' aren't '#' or '!'
    That's it. If you use:
    Code:
      size_t sz = strspn ( "abc#", "#!" ); // will return 0, because 'a' isn't '#' or '!'.

  14. #14
    Registered User
    Join Date
    Apr 2019
    Posts
    121
    Quote Originally Posted by christop View Post
    But strspn returns 1 only when the first character is special and the next character is not special. So strspn("##foo", SPECIAL_CHARS) will return 2.

    strspn and strcspn are the wrong tools for this job, as you've found.
    Actually, following the course of the conversation, you would notice that I changed my use of `strspn` to `strcspn`, which does the following:

    1) If there are no special characters, it returns the length of the original string.
    2) If it does start with a special character, it will return 0.

    That seems to be what I need. Can you expand on your comment:
    Quote Originally Posted by christop View Post
    strspn and strcspn are the wrong tools for this job

  15. #15
    Registered User
    Join Date
    Apr 2019
    Posts
    121
    This is what I test to see if I need to change the filename:
    Code:
    if(strlen(filename) != strcspn(filename, SPECIAL_CHARS))
    And this is what I use while changing the filename.
    Code:
    for(size_t x = 0; x < strlen(filename); x++)
        if(strchr(SPECIAL_CHARS, filename[x]) != NULL)
    I'm going character by character. If the next 2 characters are special, that's great, but I'm still only dealing with the first character. The second one can wait till next round.
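    Put together, the two snippets above could be sketched as follows. This is only an illustration under some assumptions: the function names `needs_fix` and `replace_special` are made up, the caller supplies the output buffer, and only the "replace with '.'" branch is shown because the encoding done by `add_character` is not given in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SPECIAL_CHARS "':,\"?&!()$/#;_<>*%|^@"

/* True if filename contains at least one special character. strcspn
 * stops at the first one, or at the terminator if there is none. */
bool needs_fix(const char *filename)
{
    return filename[strcspn(filename, SPECIAL_CHARS)] != '\0';
}

/* Copies src into dst (capacity dst_size, terminator included),
 * replacing every special character with '.'.
 * Returns false, leaving dst untouched, if dst is too small. */
bool replace_special(const char *src, char *dst, size_t dst_size)
{
    size_t len = strlen(src);
    if (len + 1 > dst_size)
        return false;
    for (size_t i = 0; i < len; i++)
        dst[i] = strchr(SPECIAL_CHARS, src[i]) != NULL ? '.' : src[i];
    dst[len] = '\0';
    return true;
}
```

    Writing through an index avoids calling strlen on the output for every character, and computing the input length once keeps the loop linear. With a 16-byte buffer, `replace_special("##foo", buf, sizeof buf)` leaves `"..foo"` in `buf`, so consecutive special characters are each handled in their own iteration.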
