Hi Everyone,
I know I'm not supposed to post assignments and expect people to do them, and that's not what I'm looking for. I was just hoping that someone could outline how I should approach this assignment. I've never done any programming before; attempting HTML was as far as I got, and I gave that up because I found it confusing. Anyway, I hope someone's able and willing to help me outline what I need to do...
The assignment is as follows. I know I need to use arrays, if statements, scanf, printf, and loops. However, I don't know how to make things like the tab, space and return keys get ignored, or how to make the X key do what the return key would normally do. I really could use a basic outline of what I need to do to get this assignment done.
~~~~~
Outline:
In this assignment you will be writing a program to detect genes in prokaryotes. In working on this assignment, you will be using loops and arrays.
DNA (Deoxyribonucleic acid) is a class of molecules that contain instructions for the construction of proteins. These proteins, in turn, are essential parts of all living organisms and participate in every process within cells. The process of translating a sequence of nucleotides into a functional protein is essentially the same as that of translating a C source code file into a functional program. This translation process is performed by another molecule called RNA polymerase (this molecule is a biological compiler). In order to find the code for a particular protein within a very long strand of DNA, the RNA polymerase molecule recognizes a particular sequence in the DNA called a promoter.
(Help, I don't understand biology!) In practical terms, your program will read in a sequence of letters made up of A, C, G, and T only. The letters stand for the particular nucleotides adenine, cytosine, guanine and thymine that make up DNA. Your program will look for the pattern TTGACA in the sequence of letters. It will also look for the pattern TATAAT. If it finds both of these elements separated by exactly 19 other letters, it will print "Gene found!"; if they are not found, it should print "No Gene found!".
Start:
Your program will start by printing the message: Please enter your DNA sequence (X to mark the end of the sequence).
Input:
Next the user will be permitted to enter the DNA sequence. The user can hit any keys on the keyboard. In response to these keys, your program should perform as follows:
A, C, G, T: Your program should add this letter to an array of characters (i.e. a string) stored inside your program.
space, tab, enter, return: Your program should ignore these characters and just keep right on processing.
X: Your program should stop adding any more letters to its array and print out the results.
any other characters: Your program should print an error message and exit with a return value of -1 (not zero!).
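One way to keep the four rules above straight is to put the decision into a small function of its own. The following is only a sketch: the names `key_kind` and `classify_key` are made up for illustration and are not part of the assignment. Note that when input is read with scanf, the Enter key normally arrives as the newline character '\n' (ASCII 10), so it is checked alongside '\r' (ASCII 13).

```c
/* Hypothetical names, not from the assignment: one value per input rule. */
enum key_kind { KEY_BASE, KEY_SKIP, KEY_END, KEY_INVALID };

/* Decide what should happen to a single character the user typed. */
enum key_kind classify_key(char c)
{
    if (c == 'A' || c == 'C' || c == 'G' || c == 'T') {
        return KEY_BASE;     /* a DNA letter: store it in the array */
    } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
        return KEY_SKIP;     /* space, tab, enter, return: ignore it */
    } else if (c == 'X') {
        return KEY_END;      /* end marker: stop reading */
    } else {
        return KEY_INVALID;  /* anything else: error, exit with -1 */
    }
}
```

Because the branches are chained with else, each character lands in exactly one category, and "any other character" needs no condition of its own.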
Scope:
The maximum length of a DNA sequence allowed will be 512 letters. If a user enters more than 512 letters (not counting spaces, tabs, enters, returns), then your program should print an error message and exit with a return value of -2 (not zero!). This means that you can declare your variable as an array of 512 char.
Operating Hints:
The easiest way to implement this program is to start by writing a program that only reads in the input. I would use scanf with %c to read one letter at a time. A few if statements (this time with an else) can be used to identify the kind of letter entered. You can pass the string around as a parameter and put the good letters inside it. Once you have this part done, I would just print out the string (without spaces, tabs, etc.). The string should match the string that the user entered.
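The input-only first step described in the hints could look roughly like the sketch below. It reads each character into a separate variable and copies it into the array only if it is a valid letter, so ignored keys never occupy a slot. The names `read_sequence` and `MAX_LEN` are assumptions for illustration, and treating input that ends without an X as an error is also an assumption, since the assignment does not cover that case. It uses fscanf on a FILE pointer; passing stdin gives the same behaviour as scanf.

```c
#include <stdio.h>

#define MAX_LEN 512   /* maximum number of letters allowed */

/*
 * Reads letters from `in` into `seq` until an X is seen.
 * Returns the number of letters stored, -1 on an invalid character,
 * or -2 if more than MAX_LEN letters are entered.
 */
int read_sequence(FILE *in, char seq[MAX_LEN])
{
    int len = 0;   /* how many letters are stored so far */
    char c;

    while (fscanf(in, "%c", &c) == 1) {
        if (c == 'A' || c == 'C' || c == 'G' || c == 'T') {
            if (len == MAX_LEN) {
                return -2;        /* this would be letter 513 */
            }
            seq[len] = c;
            len = len + 1;
        } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            /* ignored key: store nothing and read the next character */
        } else if (c == 'X') {
            return len;           /* end marker: done reading */
        } else {
            return -1;            /* any other character is an error */
        }
    }
    return -1;                    /* input ended without an X (assumption) */
}
```

A main function would call `read_sequence(stdin, sequence)`, print an error and return the negative value if there is one, and otherwise print the first `len` letters with a loop. The array is not null-terminated here, so the length has to be carried along with it.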
Searching for the patterns:
There are two patterns you need to search for: TTGACA and TATAAT separated by exactly 19 other symbols. You must write the function to search for these patterns yourself (you may not use a built-in search function or the strcmp function, etc.). The easiest way to do this is probably to write a function that accepts a parameter called position. It can then look for a "T" at position, another "T" at position+1, a "G" at position+2, and so on. Then you can call that function from within a loop for each possible position. Watch that you don't fall off the end of the string. If you do not understand what that sentence there just meant, you need to understand it before you start the assignment.
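The position-based search might be sketched as follows. This is one possible shape, not the required one: `matches_at` and `gene_found` are hypothetical names, and passing the pattern in as a parameter (instead of hard-coding each letter) is a choice made here so one function can check both patterns. With TTGACA at position..position+5 and 19 other letters after it, TATAAT has to start at position+25, and the whole window is 6 + 19 + 6 = 31 letters long.

```c
/*
 * Returns 1 if `pattern` (pat_len letters) occurs in seq starting at
 * `position`, and 0 otherwise. A pattern that would run past the end
 * of the sequence counts as "not found" instead of being read.
 */
int matches_at(const char seq[], int seq_len, int position,
               const char pattern[], int pat_len)
{
    int k;

    if (position < 0 || position + pat_len > seq_len) {
        return 0;   /* would fall off the end of the string */
    }
    for (k = 0; k < pat_len; k++) {
        if (seq[position + k] != pattern[k]) {
            return 0;
        }
    }
    return 1;
}

/* Returns 1 if TTGACA, 19 other letters, then TATAAT occur anywhere. */
int gene_found(const char seq[], int seq_len)
{
    int position;

    /* Only try positions where the whole 31-letter window fits. */
    for (position = 0; position + 31 <= seq_len; position++) {
        if (matches_at(seq, seq_len, position, "TTGACA", 6) &&
            matches_at(seq, seq_len, position + 25, "TATAAT", 6)) {
            return 1;
        }
    }
    return 0;
}
```

The caller prints "Gene found!" or "No Gene found!" once, after `gene_found` returns, so the message is not repeated for every position the loop tries.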
Testing:
Make sure you test your program very carefully with a number of examples. Some things to consider: (a) what if the pattern is right at the beginning of the string, (b) what if the pattern is right at the end, (c) what if the string is exactly 512 characters long (and has the pattern right at the end), (d) what if the string doesn't have the pattern, (e) what if the string is less than 31 symbols long.
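Typing 512 letters by hand for cases like (c) is error-prone, so a small helper that builds test sequences may be worth having. The sketch below is an assumption-laden illustration: `build_test_sequence` is a made-up name, and using 'C' as the filler letter is an arbitrary choice. It places TTGACA at `gene_start`, leaves 19 filler letters, and places TATAAT at `gene_start + 25`.

```c
#define MAX_LEN 512
#define WINDOW  31    /* 6 + 19 + 6 letters */

/*
 * Fills seq[0..total_len-1] with filler 'C' letters and, if
 * gene_start >= 0, writes TTGACA, 19 filler letters and TATAAT
 * beginning at gene_start. Pass gene_start = -1 for "no pattern".
 * Returns 1 on success, 0 if the request does not fit.
 */
int build_test_sequence(char seq[], int total_len, int gene_start)
{
    const char first[]  = "TTGACA";
    const char second[] = "TATAAT";
    int i;

    if (total_len < 0 || total_len > MAX_LEN) {
        return 0;
    }
    if (gene_start >= 0 && gene_start + WINDOW > total_len) {
        return 0;   /* the pattern would not fit in the sequence */
    }
    for (i = 0; i < total_len; i++) {
        seq[i] = 'C';
    }
    if (gene_start >= 0) {
        for (i = 0; i < 6; i++) {
            seq[gene_start + i] = first[i];
            seq[gene_start + 25 + i] = second[i];
        }
    }
    return 1;
}
```

Under these assumptions, case (a) is `build_test_sequence(seq, 100, 0)`, case (b) is `build_test_sequence(seq, 100, 69)`, case (c) is `build_test_sequence(seq, 512, 481)`, case (d) is `build_test_sequence(seq, 100, -1)`, and case (e) is `build_test_sequence(seq, 30, -1)`. Printing the letters followed by an X gives input that can be pasted or redirected into the program.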
~~~~~
This is what I have so far. I know it's not what it's supposed to be because it doesn't work properly, but I was hoping someone could point me in the right direction and give me some hints on what I need to do. There are two pieces of code, nearly the same; neither works, though.
Code:
    #include <stdio.h>

    int main()
    {
        char sequence[512];
        int i, p, t;

        printf("Please enter your DNA sequence (X to mark the end of the sequence).\n");
        for (i = 0; i < 512; i++) {
            scanf("%c", &sequence[i]);
            if (sequence[i] == 'A') {
                sequence[i] = 'A';
            }
            if (sequence[i] == 'C') {
                sequence[i] = 'C';
            }
            if (sequence[i] == 'G') {
                sequence[i] = 'G';
            }
            if (sequence[i] == 'T') {
                sequence[i] = 'T';
            }
            if ((sequence[i] == 9) || (sequence[i] == 13) || (sequence[i] == 32)) {
                i = i - 1;
            }
            if (sequence[i] == 'X') {
                if (i < 19) {
                    printf("The DNA sequence to have entered is too short for Analysis.\n");
                    return 0;
                }
                if (i > 19) {
                    if (i <= 512) {
                        p = i;
                        printf("The DNA sequence you entered is as follows:\n");
                        for (i = 0; i < p; i++) {
                            printf("%c", sequence[i]);
                        }
                    }
                    if (i > 512) {
                        return -2;
                    }
                }
            }
            /*
            if ((sequence[i] != 'A') || (sequence[i] != 'G') || (sequence[i] != 'C') ||
                (sequence[i] != 'T') || (sequence[i] != 9) || (sequence[i] != 13) ||
                (sequence[i] != 32) || (sequence[i] != 'X')) {
                printf("You have entered an invalid DNA sequence.\n");
                return -1;
            }
            */
        }
        t = p;
        for (i = 0; i <= t; i++) {
            if ((sequence[i] == 'T') && (sequence[i+1] == 'T') && (sequence[i+2] == 'G') &&
                (sequence[i+3] == 'A') && (sequence[i+4] == 'C') && (sequence[i+5] == 'A') &&
                (sequence[i+25] == 'T') && (sequence[i+26] == 'A') && (sequence[i+27] == 'T') &&
                (sequence[i+28] == 'A') && (sequence[i+29] == 'A') && (sequence[i+30] == 'T')) {
                printf("Gene Found!\n");
            } else {
                printf("No Gene Found!\n");
            }
        }
        return 0;
    }

Code:
    #include <stdio.h>

    int main()
    {
        char sequence[512];
        int i, p, t;

        printf("Please enter your DNA sequence (X to mark the end of the sequence).\n");
        for (i = 0; i < 512; i++) {
            scanf("%c", &sequence[i]);
            if (sequence[i] == 'A') {
                sequence[i] = 'A';
            }
            if (sequence[i] == 'C') {
                sequence[i] = 'C';
            }
            if (sequence[i] == 'G') {
                sequence[i] = 'G';
            }
            if (sequence[i] == 'T') {
                sequence[i] = 'T';
            }
            if ((sequence[i] == 9) || (sequence[i] == 13) || (sequence[i] == 32)) {
                i = i - 1;
            }
            if (sequence[i] == 'X') {
                if (i < 19) {
                    printf("The DNA sequence to have entered is too short for Analysis.\n");
                    return 0;
                }
                if (i > 19) {
                    if (i <= 512) {
                        p = i;
                        printf("The DNA sequence you entered is as follows:\n");
                        for (i = 0; i < p; i++) {
                            printf("%c", sequence[i]);
                            t = p;
                        }
                        for (i = 0; i <= t; i++) {
                            if (sequence[i] == 'T') {
                                if (sequence[i+1] == 'T') {
                                    if (sequence[i+2] == 'G') {
                                        if (sequence[i+3] == 'A') {
                                            if (sequence[i+4] == 'C') {
                                                if (sequence[i+5] == 'A') {
                                                    if (sequence[i+25] == 'T') {
                                                        if (sequence[i+26] == 'A') {
                                                            if (sequence[i+27] == 'T') {
                                                                if (sequence[i+28] == 'A') {
                                                                    if (sequence[i+29] == 'A') {
                                                                        if (sequence[i+30] == 'T') {
                                                                            printf("\nGene Found!\n");
                                                                            return 0;
                                                                        }
                                                                    }
                                                                }
                                                            }
                                                        }
                                                    }
                                                }
                                            }
                                        }
                                    }
                                }
                            } else {
                                printf("\nNo Gene Found!\n");
                                return 0;
                            }
                        }
                    }
                }
                if (i > 512) {
                    return -2;
                }
            }
        }
        /*
        if ((sequence[i] != 'A') || (sequence[i] != 'G') || (sequence[i] != 'C') ||
            (sequence[i] != 'T') || (sequence[i] != 9) || (sequence[i] != 13) ||
            (sequence[i] != 32) || (sequence[i] != 'X')) {
            printf("You have entered an invalid DNA sequence\n");
            return -1;
        }
        */
        return 0;
    }