You need to define the problem more precisely. Suppose you have the following text:
a b c aa a b c bb a b c cc a b c aaa a b c bbb a b c ccc
Suppose the words a, b, c, aa, bb, cc, ... are all English words. Which of them are "the most common" in the text? One approach is to list each word encountered in the text along with its count:
a: 6
b: 6
c: 6
aa: 1
...
Then you can sort the list in order of decreasing occurrence.
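As a rough illustration of the count-then-sort idea, here is a minimal Python sketch. It assumes words are separated by whitespace and that case and punctuation do not matter yet; the function name word_counts is just a placeholder for this example.

```python
from collections import Counter

def word_counts(text):
    """Count whitespace-separated words and return them most frequent first."""
    # Counter.most_common() sorts by decreasing count; words with equal
    # counts keep the order in which they were first seen.
    return Counter(text.split()).most_common()

sample = "a b c aa a b c bb a b c cc a b c aaa a b c bbb a b c ccc"
counts = word_counts(sample)

for word, n in counts:
    print(f"{word}: {n}")
```

For the sample text this lists a, b and c with 6 occurrences each, followed by the words that occur once.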
Also, you need to clarify what you mean by accents and contractions. If you have this text:
a b c à b c á b c
Then you only have to decide how you wish to count the different variants of a. Are they all the same word? Or are they each distinct words? The same goes for contractions:
it's it is don't do not
I would count that as 6 distinct words ("it's", "it", "is", "don't", "do", "not"), since "it's" != "it is". How else would you do it?
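Whichever way you decide, the choice can be made explicit in code. The following Python sketch is only one possible set of rules, not the definitive answer: it keeps forms like "it's" and "don't" as single words, and it optionally folds accented letters onto their base letter. The names fold_accents, count_words and merge_accents are made up for this example, and it assumes the apostrophe is the plain ASCII one.

```python
import re
import unicodedata
from collections import Counter

def fold_accents(word):
    """Strip combining marks so that accented variants of a letter collapse."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def count_words(text, merge_accents=False):
    """Count words; an apostrophe between letters stays inside the word."""
    # Normalize to composed form first so each accented letter is one character.
    text = unicodedata.normalize("NFC", text).lower()
    # Runs of letters, optionally joined by ASCII apostrophes (assumption).
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text)
    if merge_accents:
        words = [fold_accents(w) for w in words]
    return Counter(words)

accented = "a b c à b c á b c"
distinct = count_words(accented)                    # a, à and á counted separately
merged = count_words(accented, merge_accents=True)  # all three counted as a

contractions = count_words("it's it is don't do not")
```

With merge_accents left off, a, à and á are three different words with one occurrence each; with it on, they merge into a single a with three occurrences. The last call yields six distinct words, matching the count above.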