CSE Notes
Simplifying Complexity

In C programming, the character set refers to the set of characters that can be used to write programs. Identifiers are names used to identify variables, functions, arrays, and other user-defined items in a C program. Here’s a breakdown of the character set and the rules for creating identifiers:

Character Set in C

  1. Alphanumeric Characters:
    • Includes letters (both uppercase and lowercase: A-Z, a-z) and digits (0-9).
  2. Special Characters:
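    • Punctuation such as + - * / = < > ( ) { } [ ] ; , and # serves as operators, separators, and preprocessor syntax. Of the special characters, only the underscore (_) may appear in identifiers; others (like @, $, #, etc.) are not allowed, although some compilers accept $ as an extension.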
  3. Whitespace:
    • Spaces, tabs, and newlines are used for separating tokens in code but are not part of identifiers.
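
The three groups above can be sketched as a small classifier built on the standard <ctype.h> functions. The enum and function names here (char_class, classify_char, allowed_in_identifier) are hypothetical and chosen only for illustration; the sketch also assumes the default "C" locale, where isalpha matches only A-Z and a-z.

```c
#include <ctype.h>

/* Hypothetical categories mirroring the groups listed above. */
enum char_class { CH_LETTER, CH_DIGIT, CH_UNDERSCORE, CH_WHITESPACE, CH_OTHER };

/* Classify one character. The cast to unsigned char avoids undefined
   behavior when a plain char holds a negative value. */
enum char_class classify_char(char c)
{
    unsigned char uc = (unsigned char)c;

    if (isalpha(uc)) return CH_LETTER;      /* A-Z, a-z in the "C" locale */
    if (isdigit(uc)) return CH_DIGIT;       /* 0-9 */
    if (c == '_')    return CH_UNDERSCORE;  /* only special character allowed in identifiers */
    if (isspace(uc)) return CH_WHITESPACE;  /* space, tab, newline, ... */
    return CH_OTHER;                        /* @, $, #, -, and so on */
}

/* Nonzero if c may appear somewhere in an identifier. */
int allowed_in_identifier(char c)
{
    enum char_class k = classify_char(c);
    return k == CH_LETTER || k == CH_DIGIT || k == CH_UNDERSCORE;
}
```

Under these assumptions, allowed_in_identifier('z') is nonzero, while allowed_in_identifier('$') and allowed_in_identifier(' ') are zero.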

Rules for Identifiers

  1. Start with a Letter or Underscore:
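    • Identifiers must begin with a letter (uppercase or lowercase) or an underscore. They cannot start with a digit. Names that begin with an underscore followed by an uppercase letter or a second underscore are reserved for the implementation and are best avoided.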
  2. Subsequent Characters:
    • After the first character, identifiers can include letters, digits, and underscores.
  3. Case Sensitivity:
    • Identifiers are case-sensitive. For example, Variable, variable, and VARIABLE are considered different identifiers.
  4. Length:
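    • The C standard sets no maximum length, but it only guarantees that a limited number of leading characters are significant: since C99, at least 63 for internal identifiers and 31 for external ones. Keeping names reasonably short (typically fewer than 32 characters) aids readability and stays within both limits.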
  5. No Reserved Keywords:
    • Identifiers cannot be the same as C reserved keywords (like int, return, if, etc.).
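
As a rough illustration, the rules above can be expressed as a checking function. This is a minimal sketch, not how a compiler does it: the name is_valid_identifier is hypothetical, the keyword table is deliberately partial (the real list is longer and varies by C standard version), and the length rule is not checked.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A few C keywords for illustration only; the full list is longer. */
static const char *const sample_keywords[] = {
    "int", "return", "if", "else", "for", "while", "char", "void"
};

/* Returns 1 if name follows the rules above, 0 otherwise. */
int is_valid_identifier(const char *name)
{
    size_t i;

    /* Rule 1: must start with a letter or underscore (also rejects ""). */
    if (name == NULL || !(isalpha((unsigned char)name[0]) || name[0] == '_'))
        return 0;

    /* Rule 2: remaining characters are letters, digits, or underscores. */
    for (i = 1; name[i] != '\0'; i++)
        if (!(isalnum((unsigned char)name[i]) || name[i] == '_'))
            return 0;

    /* Rule 5: must not match a keyword. strcmp is case-sensitive
       (rule 3), so "Int" is not rejected here. */
    for (i = 0; i < sizeof sample_keywords / sizeof sample_keywords[0]; i++)
        if (strcmp(name, sample_keywords[i]) == 0)
            return 0;

    return 1;
}
```

For instance, is_valid_identifier("count1") returns 1, while is_valid_identifier("1stVariable") and is_valid_identifier("int") both return 0.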

Examples of Valid and Invalid Identifiers

  • Valid Identifiers:
    • myVariable
    • count1
    • _temp
    • MAX_SIZE
  • Invalid Identifiers:
    • 1stVariable (cannot start with a digit)
    • my-variable (hyphen is not allowed)
    • int (reserved keyword)
    • my variable (spaces are not allowed)
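
The lists above can also be checked directly with a compiler. The sketch below declares each valid name as a local variable and shows that names differing only in case are separate variables. The function names sum_valid_names and sum_case_variants are hypothetical, and the invalid declarations are left inside a comment because they would not compile.

```c
/* Each valid name from the list above is accepted as a local variable. */
int sum_valid_names(void)
{
    int myVariable = 1;
    int count1 = 2;
    int _temp = 3;       /* legal as a local name */
    int MAX_SIZE = 4;    /* all-caps names are conventionally used for constants */
    return myVariable + count1 + _temp + MAX_SIZE;
}

/* Case sensitivity: these are three distinct variables. */
int sum_case_variants(void)
{
    int Variable = 10;
    int variable = 20;
    int VARIABLE = 30;
    return Variable + variable + VARIABLE;
}

/* Moving any of these lines into a function should cause a compile error:
   int 1stVariable = 0;    starts with a digit
   int my-variable = 0;    the hyphen is read as a minus sign
   int int = 0;            reserved keyword
   int my variable = 0;    the space splits the name into two tokens
*/
```

Calling these from main, sum_valid_names() returns 10 and sum_case_variants() returns 60, which confirms that the three spellings of "variable" hold separate values.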

Summary

  • Character Set: Letters, digits, special characters (including the underscore), and whitespace.
  • Identifiers: Must start with a letter or underscore, can contain letters, digits, and underscores, are case-sensitive, and cannot be reserved keywords.