I have two files:
file1:
cigar::50 HWI-EAS-249_35:6:1:6:1154#0/2 76 32 - chromosome06 8365857 8365901 +
....
....
....
file2:
@HWI-EAS-249_35:6:1:6:1154#0/2
GGGGGGCTGAGAAGGTTGAGACAAGTAAGGTATTTCTACGTGATACTAGT GTTATTTCTCCTTACTCGCTCCTTCT
+
BBBBBC@CC@B?@C@;A>:<58@><4;>>@*8@AB;B;6>7>3=66?315 <8@8@@?/>47?;88@%%%%%%%%%%
....
....
....
....
Each of these files has more than 3,000,000 lines. I want to produce output with the following structure:
HWI-EAS-249_35:6:1:6:1154#0/2 76 32 - chromosome06 8365857 8365901 + GGGGGGCTGAGAAGGTTGAGACAAGTAAGGTATTTCTACGTGATACTAGT GTTATTTCTCCTTACTCGCTCCTTCT
.....
......
.....
That is, I want to look up each sequence by a keyword.
The keyword is the second column of every line in file1 (for example HWI-EAS-249_35:6:1:6:1154#0/2), which is the same ID that follows the @ on the header line in file2.
My idea is to build a hash table to store the data so that, given a keyword, I can find its sequence quickly.
Is there a better way to do this job?
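
To make the hash table idea concrete, here is a minimal sketch in Python. It assumes file2 is a FASTQ file with exactly four lines per read, and that the ID after the leading @ matches the second column of file1 exactly. The function names load_sequences and merge are made up for illustration.

```python
import io
from itertools import islice


def load_sequences(fastq):
    """Build a dict mapping read ID -> sequence.

    Assumes exactly four lines per record: @ID, sequence, +, quality.
    """
    sequences = {}
    while True:
        record = list(islice(fastq, 4))  # read the next four lines
        if len(record) < 2:
            break  # end of input (or a trailing partial record)
        header = record[0].rstrip("\n")
        seq = record[1].rstrip("\n")
        if header.startswith("@"):
            sequences[header[1:]] = seq  # drop the leading "@"
    return sequences


def merge(alignments, sequences):
    """Yield each file1 line minus its first column, with the sequence appended.

    Lines whose keyword (second column) has no sequence are skipped.
    """
    for line in alignments:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        seq = sequences.get(fields[1])
        if seq is not None:
            yield " ".join(fields[1:] + [seq])


# Small in-memory demo; with real data, pass open("file2") and open("file1").
demo_fastq = io.StringIO("@read1\nACGTACGT\n+\nBBBBBBBB\n")
demo_align = io.StringIO("cigar::50 read1 76 32 - chromosome06 8365857 8365901 +\n")
demo_index = load_sequences(demo_fastq)
for merged in merge(demo_align, demo_index):
    print(merged)
```

Once the dict is built, a single keyword lookup is just demo_index["read1"]. Only file2 is held in memory here and file1 is streamed line by line. Three million FASTQ lines is about 750,000 reads, which should fit in memory on a typical machine, though that depends on read length.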