Thread: Making unique integer ID# from a string.

  1. #1
    Astrophysics student Ayreon's Avatar
    Join Date
    Mar 2009
    Location
    Netherlands
    Posts
    79

    Making unique integer ID# from a string.

    I've got a bunch of text files with unique names. I want to make an array in my program in which each element represents one of those filenames.
    For instance,
    array[file_ID] holds information about one of the text files.

    How can I turn a text file name, which is a string, into a unique file_ID number?
    Eventually I'm going to use this array as an array of counters. Each file needs its own counter.
    Nothing to see here, move along...

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Start at 0 (array indices start at 0). Every time you get a new name, add one.
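    A minimal sketch of this suggestion, assuming a small fixed upper bound on the number of files. The names file_id, known_names, MAX_FILES and MAX_NAME are hypothetical and do not come from the thread. Each new name gets the next free index, and a name seen before gets its old index back:

```c
#include <assert.h>
#include <string.h>

#define MAX_FILES 64    /* hypothetical upper bound on distinct files */
#define MAX_NAME  256   /* hypothetical longest filename, including '\0' */

static char known_names[MAX_FILES][MAX_NAME];
static int  known_count = 0;

/* Return the ID for name, assigning the next free ID (0, 1, 2, ...) the
   first time a name is seen. Returns -1 if the table is full or the name
   is too long to store. */
int file_id(const char *name)
{
    assert(name != NULL);

    for (int i = 0; i < known_count; i++) {
        if (strcmp(known_names[i], name) == 0)
            return i;                 /* seen before: reuse its ID */
    }
    if (known_count == MAX_FILES || strlen(name) >= MAX_NAME)
        return -1;

    strcpy(known_names[known_count], name);
    return known_count++;             /* new name: hand out the next ID */
}
```

    With this in place, something like counters[file_id("a.txt")]++ would bump the counter for that file. Note that every call does a linear search over the stored names.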

  3. #3
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    If the files already had such a number, e.g., the files are named "123.txt", "4.txt", "6667.txt", I would say this idea is okay. If you are just going to assign arbitrary numbers to them for the sake of the program, I would say it is a bad idea, because you will need to look up the name somehow anyway, so why not just use the name in the first place? If you have to do a lookup to get the unique_id based on the filename in order to find this "counter" (another number?) for that file, you will need two arrays, and you will have to search one of them for a value.

    This is probably a good place to use a linked list, which if you haven't used one, now is the time to learn (unless you don't intend to ever write anything but very simplistic C programs, and you may not be able to do what you want that way in the end*). Then you could have the information stored in a struct, and search the list using the actual filename.

    It is unclear what you mean by counter in this context.

    * Sorry to break that to you but it's true; you may need to do a few days of work learning about linked lists before you proceed. Or keep hacking away and wind up in the same place. Trust me. You may not need an actual linked list, but you will at least need an array of pointers to a more complex datatype, either a string or a struct. Start reading and you'll figure it out.
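    One way the linked-list idea could look, as a sketch only. The struct layout and the names file_entry and find_or_add are assumptions, and the "counter" is taken to be a plain long. The list is searched by the actual filename, so no separate ID is needed:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One node per file: the name is the key, count is the per-file counter. */
struct file_entry {
    char *name;
    long count;
    struct file_entry *next;
};

/* Find the entry for name, or create it with count 0 at the head of the
   list. Returns NULL only if memory allocation fails. */
struct file_entry *find_or_add(struct file_entry **head, const char *name)
{
    assert(head != NULL && name != NULL);

    for (struct file_entry *e = *head; e != NULL; e = e->next) {
        if (strcmp(e->name, name) == 0)
            return e;                 /* already in the list */
    }

    struct file_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->name = malloc(strlen(name) + 1);
    if (e->name == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->name, name);
    e->count = 0;
    e->next = *head;                  /* push the new node onto the front */
    *head = e;
    return e;
}
```

    Starting from an empty list (struct file_entry *list = NULL;), a call such as find_or_add(&list, "a.txt")->count++ increments the counter for that file, creating the node on first use. A real program should check the return value for NULL first.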
    Last edited by MK27; 03-08-2009 at 12:34 PM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  4. #4
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    What about using a key and encrypting the filename?

    The ASCII chart comes to mind, using two digits for every letter or number in the filename.

    The numbers you choose could be different than ASCII for a very mild type of security or convenience.

  5. #5
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Adak View Post
    What about using a key and encrypting the filename?

    The ASCII chart comes to mind, using two digits for every letter or number in the filename.

    The numbers you choose could be different than ASCII for a very mild type of security or convenience.
    Neat idea, but since you probably want at least the first three characters in order to distinguish one file from another, you will have to assign an array of 262626 elements (no upper or lower case). Straight ASCII values would be 127127127 elements (so Abcsomething.txt would be array[65098099], and abcsomething.txt array[97098099]).

    Beyond three letters, this gets just plain stupid, unless you use a more complex formula to compress and encrypt.
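    The arithmetic behind those indices can be sketched as follows. The function name three_char_index is hypothetical. It concatenates the three-digit decimal ASCII codes of the first three characters, so 'a' (97), 'b' (98), 'c' (99) become 97098099, and 'A' (65 in decimal) gives 65098099:

```c
#include <assert.h>
#include <string.h>

/* Concatenate the three-digit decimal character codes of the first three
   characters of name, e.g. "abc" -> 97 098 099 -> 97098099.
   Returns -1 if name has fewer than three characters. */
long three_char_index(const char *name)
{
    assert(name != NULL);
    if (strlen(name) < 3)
        return -1;

    long index = 0;
    for (int i = 0; i < 3; i++)
        index = index * 1000 + (unsigned char)name[i];  /* shift by 3 digits */
    return index;
}
```

    An array indexed this way needs over a hundred million slots for three characters, almost all of them unused, and two files sharing their first three characters still land on the same index.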
    Last edited by MK27; 03-08-2009 at 12:45 PM.

  6. #6
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by Adak View Post
    What about using a key and encrypting the filename?

    The ASCII chart comes to mind, using two digits for every letter or number in the filename.

    The numbers you choose could be different than ASCII for a very mild type of security or convenience.
    What you're describing is a hash function, albeit an extremely expensive one. I'd stay away from cryptographic hashes and use something simpler, but the idea of hashing is probably the way to go.
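    As an illustration of "something simpler", here is a sketch using the widely circulated string hash usually attributed to Daniel J. Bernstein (often called djb2). The function name name_to_slot and the table size are assumptions, not anything specified in this thread:

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 101   /* hypothetical number of array slots */

/* Map a filename to a slot in the range 0 .. TABLE_SIZE-1.
   Different names can map to the same slot (a collision), so the
   result is NOT a unique ID on its own. */
size_t name_to_slot(const char *name)
{
    assert(name != NULL);

    unsigned long h = 5381;
    for (const unsigned char *p = (const unsigned char *)name; *p != '\0'; p++)
        h = h * 33 + *p;              /* unsigned overflow wraps, which is fine */
    return (size_t)(h % TABLE_SIZE);
}
```

    Because collisions are possible, each slot still has to store the name so that a lookup can confirm it found the right file.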
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  7. #7
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by brewbuck View Post
    What you're describing is a hash function, albeit an extremely expensive one. I'd stay away from cryptographic hashes and use something simpler, but the idea of hashing is probably the way to go.
    In which case w/r/t our astrophysics student (the OP) perhaps C is not the appropriate language to be learning.

  8. #8
    Astrophysics student Ayreon's Avatar
    Join Date
    Mar 2009
    Location
    Netherlands
    Posts
    79
    I thought of a way around this problem now, so I don't need it anymore.
    I don't want to make it too complicated because I'm the only one that's using the program to analyze some data, and I shouldn't spend too much time on the program itself.
    This can be one for when I get enough spare time.

  9. #9
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Ayreon View Post
    This can be one for when I get enough spare time.
    Kind of how I might feel about astrophysics

    Keep in mind what I was trying to say about learning the language -- there are probably better choices than C if you just need to write basic utility programs for your own purposes.
    Last edited by MK27; 03-08-2009 at 01:28 PM.

  10. #10
    Astrophysics student Ayreon's Avatar
    Join Date
    Mar 2009
    Location
    Netherlands
    Posts
    79
    Within our education there is a programming course which everyone must follow. They used C, and that's why I'm using C. It's the only language I know anything about, and I hear it is a good thing to understand C. Learning other languages is apparently easier with a C background (that's what I hear people say). Anyway, with help from this forum I'm actually starting to enjoy programming a little, so for the time being I'm gonna stick with C.

  11. #11
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Ayreon View Post
    Within our education there is a programming course which everyone must follow. They used C, and that's why I'm using C. It's the only language I know anything about, and I hear it is a good thing to understand C. Learning other languages is apparently easier with a C background (that's what I hear people say). Anyway, with help from this forum I'm actually starting to enjoy programming a little, so for the time being I'm gonna stick with C.
    Won't knock you for that! That a computer language will be easier to understand if you already know another one is something I would assume is generally true, regardless of whether C is one of them or on what side of the equation it is. So I thought I'd mention it since this could be a real "speak now or hold your peace" moment for you. The only other language I'm competent with is perl, which I still use when I think it will do what I want, well enough -- e.g. perl has hashes, so
    Quote Originally Posted by perl
    my %files;
    $files{"Abc.txt"} = "anything 17";
    $files{"xyz.jpg"} = 789 + $x;    # $x: some number defined elsewhere
    which is a few more lines in C
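    For comparison, a rough sketch of what those "few more lines" might look like in C: a small chained hash table mapping a filename to a long. All names here (entry, buckets, hash_name, lookup, NBUCKETS) are hypothetical, and the hash is the common "times 33" string hash rather than anything perl itself uses:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 101   /* hypothetical table size */

struct entry {
    char *key;
    long value;
    struct entry *next;   /* next entry whose key hashed to the same bucket */
};

static struct entry *buckets[NBUCKETS];   /* all NULL at program start */

static unsigned long hash_name(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Return a pointer to the value stored under key, creating it as 0 if the
   key is not present yet. Returns NULL only if memory allocation fails. */
long *lookup(const char *key)
{
    assert(key != NULL);
    struct entry **slot = &buckets[hash_name(key) % NBUCKETS];

    for (struct entry *e = *slot; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0)
            return &e->value;         /* key already stored */
    }

    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->key = malloc(strlen(key) + 1);
    if (e->key == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->key, key);
    e->value = 0;
    e->next = *slot;                  /* chain in front of any colliding keys */
    *slot = e;
    return &e->value;
}
```

    Under these assumptions, *lookup("xyz.jpg") = 789; plays the role of the perl assignment, and *lookup("Abc.txt") += 1; works as a per-file counter. A real program should check for a NULL return before dereferencing.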
    Last edited by MK27; 03-08-2009 at 02:50 PM.

  12. #12
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Quote Originally Posted by brewbuck View Post
    What you're describing is a hash function, albeit an extremely expensive one. I'd stay away from cryptographic hashes and use something simpler, but the idea of hashing is probably the way to go.
    No, I was thinking of a simple look up table - very fast:

    e.g.:
    A = 22, B = 38, C = 99, D = unique N, etc.

    @MK, since all letters "map" or correspond to 2 digit numbers, there is no need for what you describe. Of course, the resulting number would have twice as many digits as the filename has characters.
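    A sketch of such a lookup-table encoding, under invented assumptions: the alphabet below and the rule "position plus 10" are made up for illustration and are not the values from this post. The result is written as a digit string because it outgrows a 32-bit integer after only a few characters:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alphabet: a character's code is its position here plus 10,
   so every code has exactly two digits (10 .. 48). */
static const char ALPHABET[] = "abcdefghijklmnopqrstuvwxyz0123456789._-";

/* Write the two-digit code of every character of name into out.
   out must hold at least 2 * strlen(name) + 1 bytes.
   Returns 0 on success, -1 if out is too small or name contains a
   character that is not in ALPHABET. */
int encode_name(const char *name, char *out, size_t out_size)
{
    assert(name != NULL && out != NULL);

    size_t len = strlen(name);
    if (out_size < 2 * len + 1)
        return -1;

    for (size_t i = 0; i < len; i++) {
        const char *pos = strchr(ALPHABET, name[i]);
        if (pos == NULL)
            return -1;                /* character has no code in the table */
        int code = (int)(pos - ALPHABET) + 10;
        out[2 * i]     = (char)('0' + code / 10);
        out[2 * i + 1] = (char)('0' + code % 10);
    }
    out[2 * len] = '\0';
    return 0;
}
```

    With this table, "ab.txt" encodes to the 12-digit string 101146293329. The codes are unique per name, but a number that long cannot serve directly as an array index.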

  13. #13
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Adak View Post
    @MK, since all letters "map" or correspond to 2 digit numbers, there is no need for what you describe. Of course, the resulting number would have twice as many digits as the filename has characters.
    Adak, it will be exactly what I describe because you cannot just guess the first letter and subtract 42 or something. Using ASCII values you would need three digits for each letter, so the highest real value, if "zzz", would be array[122122122].

  14. #14
    Astrophysics student Ayreon's Avatar
    Join Date
    Mar 2009
    Location
    Netherlands
    Posts
    79
    Quote Originally Posted by MK27 View Post
    this could be a real "speak now or hold your peace moment" for you.
    I'm not sure I understand what you mean by that; I'm not marrying C. I am open to suggestions for using other languages (the one you just mentioned looks handy), but not right now, because I've already started with C and I wouldn't want to recreate my program in another language, which I would have to learn from scratch.

    But isn't it true that most languages are based on C, and that that's why it is easier to learn other languages if you know C?

  15. #15
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Ayreon View Post
    But isn't it true that most languages are based on C, and that that's why it is easier to learn other languages if you know C?
    I think that most languages are based on "lower level" concerns, but C is more transparently close to those concerns than most other languages -- which creates this illusion.

    That's also why something like perl is easier to use for certain things -- it is much less concerned with low level transparency.
