    What is a token in a compiler?

    Any "item" that the compiler cares about, such as keywords, names of variables and types, and symbols (+, -, --, ++, *, &, &&, (, ), [, ], etc.).

    A token is a lexical atom.

    The input to a compiler is a stream of characters. Traditionally the compiler's front end is divided into two parts (at least conceptually): the "lexer" and the "parser." The lexer takes the input characters and separates them into pieces called "tokens" which become input to the parser.

    For instance the numeric literal "123", while composed of three distinct characters, comprises a single lexical element, i.e. a token. The parser phase of the compiler receives this element as an atomic unit, not caring about its composition of characters.
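
    To make the lexer/parser split concrete, here is a minimal sketch of a regex-based lexer. The token categories (NUMBER, IDENT, OP) and the function name lex are assumptions chosen for illustration, not taken from any particular compiler; real lexers also distinguish keywords, handle comments, string literals, and so on.

```python
import re

# Hypothetical token categories for a tiny C-like language (illustrative only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"\+\+|--|&&|[-+*&()\[\]=;]"),  # two-character operators are tried first
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(text):
    """Split text into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# The three characters '1', '2', '3' come out as one NUMBER token.
print(lex("count = count + 123;"))
```

    The parser would then consume this list of pairs and never look at individual characters again.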

    More formally, a token is an abstract object representing an atomic unit of parser input -- the actual characters which make up the token form a "lexeme."
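
    The token/lexeme distinction can be sketched as a small data structure. The Token class, its field names, and the toy rule is_number_sum below are hypothetical, made up for this illustration: the point is only that a parser rule can be decided from the abstract kinds, while the lexemes differ freely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str    # abstract category the parser matches on
    lexeme: str  # exact characters from the source text


def is_number_sum(tokens):
    """Toy parser rule: NUMBER PLUS NUMBER, decided by token kinds alone."""
    return [t.kind for t in tokens] == ["NUMBER", "PLUS", "NUMBER"]


a = [Token("NUMBER", "123"), Token("PLUS", "+"), Token("NUMBER", "7")]
b = [Token("NUMBER", "0"), Token("PLUS", "+"), Token("NUMBER", "99999")]

# Different lexemes, same sequence of kinds, so the rule accepts both.
print(is_number_sum(a), is_number_sum(b))
```

    Later phases still need the lexeme (to know which number or which variable name was written), which is why a token usually carries it along.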

    Tokens might not even correspond to actual pieces of input. Although this doesn't happen in C and C++, in Python for instance the lexer analyzes indentation and produces "INDENT" and "DEDENT" tokens when the indentation level changes. These tokens don't directly correspond to any particular character or sequence of characters in the input.
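
    You can see this for yourself with Python's standard tokenize module. The snippet below is a small sketch; the helper name token_kinds and the sample source are my own. In CPython the DEDENT token is reported with an empty string, i.e. no characters of its own.

```python
import io
import tokenize


def token_kinds(source):
    """Return (token name, token text) pairs for a piece of Python source."""
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)]


source = "if x:\n    y = 1\nz = 2\n"

for name, text in token_kinds(source):
    print(name, repr(text))
```

    In the printed list, an INDENT appears before the token for y and a DEDENT appears before the token for z, even though nothing in the source spells out "end of block".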
