What Is the Character Set in C++

The character set in C++ consists of uppercase and lowercase letters, digits, special characters, and white space. The letters and digits together constitute the alphanumeric set. Note: The compiler ignores white space between tokens, except where it is needed to separate them or where it is part of a character or string constant. The C++ character set is as follows:

Alphabets: Uppercase: A B C ... Z, Lowercase: a b c ... z
Digits: 0 1 2 3 4 5 6 7 8 9
Special Characters: , < > . _ ( ) : $ ^ ? & * { } [ ] / \ " ' ! ~ | - % # etc.
White space characters: blank space, newline, carriage return, form feed, horizontal tab, vertical tab
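
As a rough illustration of these categories, the sketch below sorts a single character into one of them using the standard <cctype> functions. The helper name classify is hypothetical and not part of the original text, and the results assume the default "C" locale.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: names the character-set category of one character.
std::string classify(char c) {
    // <cctype> functions expect a value representable as unsigned char.
    unsigned char uc = static_cast<unsigned char>(c);
    if (std::isalpha(uc)) return "alphabet";
    if (std::isdigit(uc)) return "digit";
    if (std::isspace(uc)) return "white space";
    if (std::ispunct(uc)) return "special character";
    return "other";
}
```

For example, classify('A') yields "alphabet", classify('7') yields "digit", and classify('\t') yields "white space".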

Tokens, Identifiers and Keywords

Tokens are the smallest individual units in a program. A C++ program consists of many elements, which the compiler identifies as tokens. In C++, tokens are categorized as:

  • Keywords
  • Identifiers
  • Constants
  • Special characters
  • Operators
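
To make these categories concrete, the sketch below labels each token of one ordinary statement in comments. The function add_ten and its variable names are hypothetical, chosen only for illustration.

```cpp
// Hypothetical function used only to show how one statement splits into tokens.
int add_ten(int count) {
    // Tokens in the statement below, left to right:
    //   int    -> keyword
    //   total  -> identifier
    //   =      -> operator
    //   count  -> identifier
    //   +      -> operator
    //   10     -> constant
    //   ;      -> special character (statement terminator)
    int total = count + 10;
    return total;
}
```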


Keywords are reserved words declared by the C++ language. They have a predefined meaning that the user cannot change, and they cannot be used for any other purpose, such as naming a variable. The following are the keywords used in C++:

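auto      double    int       struct
break     else      long      switch
case      enum      register  typedef
char      extern    return    union
const     float     short     unsigned
continue  for       signed    void
default   goto      sizeof    volatile
do        if        static    while
Keywords common to C and C++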

C++ Specific Keywords

Several keywords are specific to C++. These keywords primarily deal with classes, templates, and exception handling. They are listed below:

Keywords specific to C++
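
As a small illustration of a few such keywords in use, the sketch below touches each of the three areas mentioned: classes, templates, and exception handling. It is not a complete keyword list, and the names Counter, larger, and safe_divide are hypothetical.

```cpp
#include <stdexcept>

// class, public, private: keywords for defining classes.
class Counter {
public:
    void increment() { ++value_; }
    int value() const { return value_; }
private:
    int value_ = 0;
};

// template, typename: keywords for writing generic code.
template <typename T>
T larger(T a, T b) { return (a > b) ? a : b; }

// try, throw, catch: keywords for exception handling.
int safe_divide(int numerator, int denominator) {
    try {
        if (denominator == 0) throw std::invalid_argument("division by zero");
        return numerator / denominator;
    } catch (const std::invalid_argument&) {
        return 0;  // fallback value chosen only for this sketch
    }
}
```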


An identifier name is formed using letters, digits, and the underscore character. Identifiers are used to identify or name variables, symbolic constants, functions and so on. Note: Some older compilers treat only the first 31 characters of an identifier as significant, so longer names that differ only after that point may clash on such compilers. Standard C++ itself sets no maximum identifier length, and modern compilers treat far more than 31 characters as significant.

C++ is case sensitive: uppercase and lowercase letters are treated as different characters. For example, rate and RATE are treated as different identifiers.
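
A minimal sketch of this behavior follows. The function name total_rate and the values are hypothetical; the point is only that rate and RATE can coexist as two separate variables.

```cpp
// Hypothetical function: rate and RATE are two distinct variables.
double total_rate() {
    double rate = 2.5;   // lowercase identifier
    double RATE = 10.0;  // a different identifier, despite the same letters
    return rate + RATE;  // adds the two separate values
}
```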

Variables: A variable is an entity whose value can change during program execution and that the program knows by name. Every variable is associated with a memory location, which the variable name refers to. A variable can hold only one value at a time. The following components are associated with a variable:

  • Data type: char, int, float, date (user-defined), etc.
  • Variable name: user view
  • Binding address: machine view
  • Value: data stored in memory location
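
The four components can be seen together in a short sketch. The function name show_variable_parts and the variable marks are hypothetical, and the address printed will differ from run to run.

```cpp
#include <iostream>

// Hypothetical function: prints the parts of one variable and returns true
// if the variable kept its address after being assigned a new value.
bool show_variable_parts() {
    int marks = 85;               // data type: int, name: marks, value: 85
    const int* address = &marks;  // binding address of the memory location
    std::cout << "value: " << marks << ", address: " << address << '\n';

    marks = 90;  // the same location now holds 90; the old value 85 is gone
    std::cout << "value: " << marks << ", address: " << &marks << '\n';

    return address == &marks && marks == 90;
}
```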

Variable Names

Variable names are identifiers used to name variables. They are the symbolic names assigned to memory locations. A variable name consists of a sequence of letters, digits, and underscores, and its first character must be a letter or an underscore. The following are some valid variable names:

  • i
  • sum
  • class_rollno
  • classRollno
  • StudentName
  • emp_num
  • _num
  • rankl_xl

The following are some invalid variable names:

  • a's (illegal character ')
  • roll number (blank space not allowed)
  • 5Student (first character must be a letter or an underscore)
  • student,record (comma not allowed)
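
The naming rules can be approximated by a small checker. This is a minimal sketch, assuming that a leading underscore is allowed (as in _num above); the function name is_valid_name is hypothetical, and the check does not reject keywords, which a real compiler would.

```cpp
#include <cctype>
#include <string>

// Hypothetical checker covering only the character rules for variable names.
bool is_valid_name(const std::string& name) {
    if (name.empty()) return false;

    // The first character must be a letter or an underscore.
    unsigned char first = static_cast<unsigned char>(name[0]);
    if (!(std::isalpha(first) || first == '_')) return false;

    // Every character must be a letter, a digit, or an underscore.
    for (char ch : name) {
        unsigned char uc = static_cast<unsigned char>(ch);
        if (!(std::isalnum(uc) || uc == '_')) return false;
    }
    return true;
}
```

With this sketch, names such as sum and class_rollno pass, while roll number and 5Student are rejected.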