sed get everything between two characters

This is a discussion within the Linux Programming forums, part of the Platform Specific Boards category.

  1. #1
    quo
    Registered User
    Join Date
    May 2011
    Posts
    116

    Hello,

    I'm writing a bash script and I have the following line:

    word5 word1 (word3) `word2' word4 [string2]

    I want to save everything that is between [ ] in a variable,
    and then everything that is between ` and ' in a second variable.

    How can I do that?
    Thanks in advance

  2. #2
    anduril462
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,498
    Try something like this (where foo.txt contains your line). If the line is in a variable, you could echo it and pipe it into sed.
    Code:
    sed 's/.*(\([a-zA-Z0-9_]*\)).*/\1/' < foo.txt
    1. s/.../.../ is sed's substitute command, which replaces text using regular expressions. The text between the first two / characters is the pattern to match. The text between the second and third / is what the matched text is replaced with.
    2. The .* at each end of the pattern matches zero or more of any character. Both are needed so that sed matches all the other text on the line and gets rid of it in the replacement phase.
    3. The bare ( and ) are literal parentheses: the ones surrounding word3 in your example.
    4. The backslash-parentheses \( and \) create a subexpression, something that sed will match and remember for the replacement part.
    5. [a-zA-Z0-9_] is a character set; it should contain all the possible characters in your word. In this case, I have upper and lower case letters, digits and underscore. Add more if you want. The * following it means "zero or more occurrences", so you can have an empty string if you want.
    6. \1 is what all the matched text gets replaced by. It refers to the first subexpression (everything inside the backslash-parentheses), i.e. everything inside the literal parentheses.
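To see the pieces above working together, here is a minimal sketch that runs the same command on the sample line from the question. Feeding the line through a here-document instead of foo.txt is just a convenience for this example, not part of the original suggestion.

```shell
# Feed the sample line to sed on standard input. The quoted 'EOF' delimiter
# stops the shell from interpreting the backtick inside the here-document.
sed 's/.*(\([a-zA-Z0-9_]*\)).*/\1/' <<'EOF'
word5 word1 (word3) `word2' word4 [string2]
EOF
```

This should print word3. Note that when the pattern does not match a line, sed prints that line unchanged; if that is a problem, sed -n with the p flag on the substitution (s/.../.../p) prints only the lines where a replacement happened.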


    You may need to tweak it if you want different characters in your word (change the character set), and you will need to create a similar command to handle the `' word (change the literal parentheses to ` and ').

    Now, to get that into a bash variable, you need to use bash command substitution.
    Code:
    var=$(some command)
    That simply substitutes the $(some command) with the output of some command. That output is stored in the shell variable var. You would use your sed commands in there.
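As a rough illustration of how the two pieces combine, the sketch below stores the sample line in a variable and captures sed's output with $(...). The variable names line and paren_word are made up for this example.

```shell
# Inside double quotes the backtick must be backslash-escaped for the shell;
# the single quote and the brackets are literal there.
line="word5 word1 (word3) \`word2' word4 [string2]"

# $(...) runs the pipeline and is replaced by its output
# (trailing newlines are stripped).
paren_word=$(printf '%s\n' "$line" | sed 's/.*(\([a-zA-Z0-9_]*\)).*/\1/')

echo "paren_word=$paren_word"
```

With the sample line this should leave word3 in paren_word. printf is used instead of echo only because its handling of backslashes is more predictable across shells.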

    EDIT: I just realized you wanted the thing in [ ], not the thing in ( ). Oh well, consider it "an exercise for the reader". You may need to escape certain characters with a backslash.

  3. #3
    quo
    Registered User
    Join Date
    May 2011
    Posts
    116

    Thank you so much for your answer. It was very helpful!
    For the [] I did:

    Code:
    sed 's/.*\[\([a-zA-Z0-9_]*\)\].*/\1/'
    and it worked just fine, but for the `'
    I did:

    Code:
    sed 's/.*\`\([a-zA-Z0-9_]*\)\'.*/\1/'
    and it returned:

    Unmatched '.

    Why is that, since I escaped the character?

  4. #4
    anduril462
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,498
    That has to do with how bash handles quoting. You didn't actually escape the character. Inside single quotes, virtually nothing has special meaning, and nothing can be escaped. Not even a single quote can be escaped, since the escape character (\) has no special meaning there -- it's just a literal backslash. You need to get out of single-quote mode temporarily. Read "BASH: Single-quotes inside of single-quoted strings" by Stuart Colville for an example.
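For the command in question, leaving single-quote mode could look like the sketch below. It shows two common options and is not taken from the linked article; the variable names are made up for the example. The backtick is deliberately left unescaped in the first option: it is already literal inside single quotes, and as far as I know GNU sed treats the sequence \` as a start-of-buffer anchor rather than a literal backtick.

```shell
line="word5 word1 (word3) \`word2' word4 [string2]"

# Option 1: close the single-quoted string, add one backslash-escaped quote,
# then reopen the single-quoted string: '...'\''...'
quoted_word=$(printf '%s\n' "$line" | sed 's/.*`\([a-zA-Z0-9_]*\)'\''.*/\1/')

# Option 2: double-quote the whole sed script. Now the single quote is
# literal, but the backtick needs a backslash so the shell does not treat it
# as command substitution. \( \) and \1 reach sed unchanged.
quoted_word2=$(printf '%s\n' "$line" | sed "s/.*\`\([a-zA-Z0-9_]*\)'.*/\1/")

echo "$quoted_word $quoted_word2"
```

Both variables should end up holding word2. Option 1 keeps everything else in the script free of shell interpretation, which tends to be safer once the regular expression contains $ or more backslashes.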

  5. #5
    quo
    Registered User
    Join Date
    May 2011
    Posts
    116
    Thank you! Solved.
