When you write a program, you express C source files as text lines containing characters from the source character set. When the program executes in the target environment, it uses characters from the execution character set. These character sets are related, but need not have the same encoding or all the same members.
(see the below frame [figure], if not visible please click here : http://mcs011.blogspot.com/p/basic-character-set-in-c-variables.html )
Every character set contains a distinct code value for each character in the basic C CHARACTER SET. A character set can also contain additional characters with other code values. The C language character set has alphabets, numbers, and special characters as shown below:
1. Alphabets, both uppercase and lowercase : A, B, C, ....... Z and a, b, c, ...... z
2. Numbers : 0, 1, 2, 3, ............. 9
3. Special Characters :
; : { } , ' " | > <
/ \ ~ - _ [ ] ! $ ?
* + = ( ) % # ^ @ &
----------------------------------
3.3 IDENTIFIERS AND KEYWORDS
----------------------------------
Identifiers are names given to various program elements such as constants, variables, functions and arrays. Every element in the program has its own distinct name, but a name can be chosen only if it conforms to the rules for valid names in the C language. Let us first study the rules for defining names or identifiers.
3.3.1 RULES FOR FORMING IDENTIFIERS : Identifiers are defined according to the following rules :
- It consists of LETTERS, DIGITS and the UNDERSCORE ( _ ).
- First Character must be an Alphabet or Underscore.
- Both --- UPPER and LOWER Cases are allowed. Same text of different case is not equivalent, for example TEXT is not same as text.
- Except the special character underscore ( _ ), no other special symbols can be used.
Example of some VALID IDENTIFIERS are shown below :
X X123 _XI temp tax_rate
Example of some INVALID IDENTIFIERS are shown below :
123 First character must be an ALPHABET or UNDERSCORE
"X." Quotes and period not allowed
order-no Hyphen not allowed
error flag Blank space not allowed
--------------------------------------
3.3.2 KEYWORDS :
--------------------------------------
Keywords are RESERVED WORDS which have standard, predefined meaning in C. They cannot be used as program-defined identifiers.
C89 has 32 keywords (reserved words with special meaning):
auto | double | int | struct |
break | else | long | switch |
case | enum | register | typedef |
char | extern | return | union |
const | float | short | unsigned |
continue | for | signed | void |
default | goto | sizeof | volatile |
do | if | static | while |
C99 adds five more keywords:
_Bool | inline |
_Complex | restrict |
_Imaginary |
C11 (earlier known as C1X) adds seven more keywords:
_Alignas | _Generic | _Static_assert |
_Alignof | _Noreturn | _Thread_local |
_Atomic |
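NOTE : The C89 keywords are all in lower case, so uppercase spellings of the same names (for example INT or WHILE) can be used as identifiers. Most keywords added by later standards, such as _Bool, begin with an underscore followed by an uppercase letter.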
-------------------------------------------------
3.4 DATA TYPE AND STORAGE :
------------------------------------------------
To store data inside the computer we need to first identify the type of data elements we need in our program. There are several different types of data, which may be represented differently within the computer memory. The data type specifies two things :
1. Permissible range of values that it can store.
2. Memory requirement to store a data type.
C Language provides four basic Data Types, viz. int, char, float and double. Using these, we can store data in simple ways as single elements or we can group them together and use different ways (to be discussed later) or store them as per requirement.
BELOW : TABLE SHOWING DIFFERENT DATA TYPES IN C :-
The four basic data types are described in the following Table :
Basic Data Types :
=========================================================
DATA     FULL NAME OF        MEMORY     RANGE
TYPE     DATA TYPE
------   -----------------   --------   --------------------
int      Integer             2 Bytes    -32,768 to 32,767
char     Character           1 Byte     -128 to 127
float    Floating point      4 Bytes    3.4e-38 to 3.4e+38
         number
double   Floating point      8 Bytes    1.7e-308 to 1.7e+308
         number with
         higher precision
=========================================================
NOTE : These sizes are typical of a 16-bit compiler; the actual sizes depend on the compiler and the machine.
The memory size associated with a data type determines the range of numbers that can be stored in a data item of that type.
-------------------------------------------------------------
3.5 DATA TYPE QUALIFIERS :
-------------------------------------------------------------
[Machine data types - All data in computers based on digital electronics is represented as bits (alternatives 0 and 1) on the lowest level. The smallest addressable unit of data is usually a group of bits called a byte (usually an octet, which is 8 bits). The unit processed by machine code instructions is called a word (as of 2011, typically 32 or 64 bits). Most instructions interpret the word as a binary number, such that a 32-bit word can represent unsigned integer values from 0 to 2^32 − 1 or signed integer values from −2^31 to 2^31 − 1. Because of two's complement, the machine language and the machine do not need to distinguish between these unsigned and signed data types for the most part]
short, long, signed and unsigned are called the data type qualifiers. They are applied to the integer types (long can also be applied to double). A short int may require less space than int, and a long int may require more space than int.
If ---------- int and short int -- takes-- 2 bytes, *
then ----- long int --------------- takes-- 4 bytes.
* NOTE : DISCUSSION :
* In some embedded applications an int is 32 bits and a short int is 16 bits; a word
is also 16 bits but isn't treated the same as an int by the compiler.
* The ISO C standard says the *minimum* range of short is -32767 to +32767.
It can have greater range. The standard says short must be a subrange of int.
They can both be 32-bit, for example. Be careful with the term "byte" in C.
It usually means 8 bits, but it can be larger.
* On a 16-bit processor, int and short int are both 2 bytes,
whereas on a 32-bit processor, int = 4 bytes and short int = 2 bytes
Unsigned types use all bits for magnitude; therefore, an unsigned number can be larger. For example, a 2-byte signed int ranges from -32768 to +32767 and an unsigned int ranges from 0 to 65,535. Similarly, the char data type is used to store a character. It requires 1 byte. Signed char values range from -128 to 127 and unsigned char values range from 0 to 255. These can be summarized as under :
------------------------------------------------------------
Data Type           Size(bytes)     Range
------------------------------------------------------------
Short int or int    2               -32768 to 32767
Long int            4               -2147483648 to 2147483647
Signed int          2               -32768 to 32767
Unsigned int        2               0 to 65535
Signed char         1               -128 to 127
Unsigned char       1               0 to 255
Boolean (_Bool)     1 (typically)   0 (false) or 1 (true)
------------------------------------------------------------
A SAMPLE PROGRAM SHOWING VARIOUS DATA TYPES
Sample Code
#include <stdio.h>
int main(void)
{
    int a = 3944;
    long b = -199020930;
    double c = 7.534e-10;
    double * d = &c;
    /* Any nonzero value (or non-null pointer) converts to 1; zero converts to 0 */
    _Bool ba = a;
    _Bool bb = b;
    _Bool bc = c;
    _Bool bd = d;
    _Bool be = ( 1 == 2 );
    printf( "ba = %d\n", ba );
    printf( "bb = %d\n", bb );
    printf( "bc = %d\n", bc );
    printf( "bd = %d\n", bd );
    printf( "be = %d\n", be );
    return 0;
}
----------------------
3.6 VARIABLES
----------------------
A variable is an identifier whose value can change during execution.
It is a named data storage location in your computer's memory. By using a variable's name in your program, you are, in effect, referring to the data stored there. A variable represents a single data item, i.e. a numeric quantity, a character constant or a string constant. Note that a value must be assigned to the variable at some point in the program; the statement that does this is termed an assignment statement. The variable can then be accessed later in the program. If the variable is accessed before it is assigned a value, it may give a garbage value. The data type of a variable doesn't change, whereas the value assigned to it can change.
All variables have three essential attributes:
- the name : a, b, c, d, ... z, r1, r2, r3, ..., sohan, jon, marry etc.
- the value : any real number, integer etc.
- the memory where the value is stored : memory addresses, registers etc.
For example, in the following C PROGRAM a, b, c and d are declared variables, but the variable e is used without being declared. Compiling this source code therefore produces an error for e.
main()
{
int a, b, c;
char d;
a=3;
b=5;
c=a+b;
d='a';
e=d;
getch();
}
-----------------------------------------------------------
3.7 DECLARING VARIABLES :
-----------------------------------------------------------
Before any data can be stored in memory, we must assign a name to the memory locations that will hold it.
For this we make declarations.
A declaration associates a group of identifiers with a specific data type. All variables need to be declared before they appear in program statements; otherwise, accessing them results in JUNK VALUES or a diagnostic error. The syntax for declaring variables is as follows :
data-type variable-name(s);
For example,
int a;
short int x,y;
int c,d;
long e,f;
float r1,r2;
----------------------------------------------------------------
3.8 INITIALISING VARIABLES :
-----------------------------------------------------------------
When variables are declared, initial values can be assigned to them in two ways:
a) Within a type declaration : the value is assigned at declaration time.
Example,
int a=10;
float b=0.4e-5;
char c='a';
b) Using an assignment statement : the values are assigned just after the declarations are made.
Example,
a=10;
b=0.4e-5;
c='a';
--------------------------
CHECK YOUR PROGRESS - 1
--------------------------
1. Identify Keywords and Valid Identifiers among the following :
Ans.
Key Words : int, union
Valid Identifiers : hello, student_1, max_value
2. Declare type variables for roll no, total_marks and percentage.
Ans.
{
int roll_no;
float total_marks, percentage;
-----------
-----------
}
3. How many bytes are assigned to store for the following ? :
Ans. Unsigned Character : 1 Bytes
Unsigned Integer : 2 Bytes
Double : 8 Bytes
------------------------------------
3.9 CONSTANTS :
------------------------------------
A constant is an identifier whose value can not be changed throughout the execution of a program whereas the variable value keeps on changing.
CONSTANTS
1. INTEGER CONSTANTS
   - Decimal integer constants (valid and invalid forms)
   - Octal integer constants
   - Hexadecimal integer constants (valid and invalid forms)
   - Unsigned integer constants
   - Long integer constants
2. FLOATING POINT CONSTANTS
   - Valid floating point numbers
   - Invalid floating point numbers
3. CHARACTER CONSTANTS
   - Single character
   - Zero is the NULL character
4. STRING CONSTANTS
   - Enclosed within double quotes, e.g. "red", "4121*(I+3)"
[ CONSTANTS FALL INTO TWO MAJOR CATEGORIES : NUMERIC CONSTANTS AND CHARACTER CONSTANTS ]
In C there are four basic types of constants. They are :
- Integer Constants
- Floating Point Constants
- Character Constants
- String Constants
3.9.1 INTEGER CONSTANTS :
**********************************************************
1. Decimal Integer Constants - Consist of digits 0 - 9; the first digit should not be zero.
2. Invalid Decimal Integer Constants -
       12,45   Comma not allowed
       36.0    Illegal character ( . )
       10 10   Blank space not allowed
       10-10   Illegal character ( - )
       0900    First digit should not be zero
3. Octal Integer Constants - Consist of digits 0 - 7. The first digit
       must be zero. (e.g. 0 01 0743 0777)
4. Hexadecimal Integer Constants - These constants begin with 0x or 0X and
       are followed by a combination of digits taken
       from the hexadecimal digits 0 - 9, a to f or A to F.
       Valid Hexadecimal Int. Constants : 0X0 0X1 0XF77 0xabcd
       Invalid Hexa. Int. Constants : 0BEF - x is not included
                                      0x.4bff - Illegal character ( . )
                                      0XGBC - Illegal character G
3.9.2 FLOATING POINT CONSTANTS :
************************************************************************
It is a base 10 number containing a decimal point or an exponent.
a. Valid floating point numbers are :
0. 1.
000.2 5.61123456
50000.1 0.000741
1.6667E+3 0.006e-3
b. Invalid floating point numbers are :
1 decimal point or exponent is missing
1,00.0 comma not allowed
2E+10.2 exponent must be an integer
3E 10 no blank space allowed
e.g. - A floating point number with the value 5 x 10^4 can be written in any of these forms:
50000. 5e4
5e+4 5E4
5.0e+4 .5e5
The magnitude of a float ranges from about 3.4E-38 to a maximum of about 3.4E+38, and 0.0 can also be represented. Floating point constants written without a suffix are taken as double precision numbers, which typically occupy 8 bytes.
3.9.3 CHARACTER CONSTANTS :
******************************************
This constant is a single character enclosed in apostrophes ' '.
For example some of the character constants are shown below :
'A', 'X', '3', '$'
'\0' is a null character having value zero.
Character constants have integer values that depend on the character set adopted for the computer. The ASCII character set, which is in common use, is a 7-bit code with 2^7 = 128 different characters. The digits '0' to '9' have ASCII values 48 to 57; the uppercase letters start at 'A' with value 65 and the lowercase letters at 'a' with value 97, each sequentially ordered. For example :
'A' has 65, blank has 32
ESCAPE SEQUENCE :
[An escape sequence is a series of characters used to change the state of computers and their attached peripheral devices. These are also known as control sequences, reflecting their use in device control. Some control sequences are special characters that always have the same meaning. Escape sequences use an escape character to change the meaning of the characters which follow it, meaning that the characters can be interpreted as a command to be executed rather than as data.]
Some non-printable characters can be written by preceding a character with the backslash '\'. Within character constants and string literals, you can write a variety of escape sequences. Each escape sequence determines the code value for a single character. You can use escape sequences to represent character codes:
- you cannot otherwise write (such as \n)
- that can be difficult to read properly (such as \t)
- that might change value in different target character sets (such as \a)
- that must not change in value among different target environments (such as \0)
The following is the list of escape sequences :
-------------------------------------------------------------------
Character   Escape      Character   Escape     Meaning
            Sequence                Sequence
-------------------------------------------------------------------
"           \"          FF          \f         Form Feed
'           \'          NL          \n         New Line
?           \?          CR          \r         Carriage Return
\           \\          HT          \t         Horizontal Tab
BEL         \a          VT          \v         Vertical Tab
(Beep Sound)
BS          \b
(Back Space)
-------------------------------------------------------------------
3.9.4 STRING CONSTANTS : It consists of sequence of characters enclosed within
double quotes. For example,
"red" "Blue Sea" "41213*(I+3)".
------------------------------------------------------------
3.10 SYMBOLIC CONSTANTS :
------------------------------------------------------------
A Symbolic Constant is a name that substitutes for a sequence of characters: a numeric constant, a character constant or a string constant. When the program is compiled, each occurrence of a symbolic constant is replaced by its corresponding character sequence. The syntax is as follows:
#define symbolic_name sequence_of_characters
EXAMPLE : Note that here the symbolic name DA is used after #define. DA stands for Dearness Allowance and is given the value 10:
#define DA 10
Suppose you are computing a total salary. In salary=DA+a+b you can use the defined name DA instead of writing 10 again and again.
NOTE : Do not put a semicolon ( ; ) after a #define line.
Let us make this clearer with another example :
Note that here we have taken DA = BASIC / 10, that is, DA (dearness allowance) is one tenth of the basic salary. If we take BASIC SALARY = 2000, then
total salary = BASIC + DA, where DA is the CONSTANT defined with #define.
NOTE : You can use #define wherever you need long mathematical calculations etc.
NOTE : You can also use
#define AND && (the word AND instead of the two ampersands of the AND operator)
#define EQL == (the easy word EQL instead of two equal signs)
#define OR || (the easy word OR instead of two pipe symbols)
EXAMPLE : We can also use #define with an argument :
#define salary(a) a + 200
if we want to add Rs. 200 at the time of salary counting, as in
printf("THE SALARY IS %d", salary(2000));
HENCE THE RESULT IS :
THE SALARY IS 2200
EXAMPLE : Using #define to give a SHORT NAME to a LONG piece of text :
Suppose we have to use the text Thanks for Joining again and again in a program. Instead of writing Thanks for Joining every time, we can use #define just once:
#define THNX "Thanks for Joining"
and then write THNX wherever the text is needed, for example printf(THNX); This saves time while writing the program.
NOTE : Please note that in printf we are not using " " double quotes around THNX, as they are already part of its definition:
#define THNX "Thanks for Joining"
The # character is used for pre-processor commands. A preprocessor is a system program which comes into action before the compiler and replaces each symbolic name with its replacement text. This is what allows the printf statement to work correctly.
ADVANTAGES OF USING SYMBOLIC CONSTANTS ARE :
- They can be used to assign names to values
- Replacement of value has to be done at one place.
- Wherever the name appears in the text it gets the value by execution of the pre-processor.
- This saves time. If the SYMBOLIC CONSTANT appears 20 TIMES in the program; it needs to be changed at one place only.
-------------------------------------------
CHECK YOUR PROGRESS 2 :
-------------------------------------------
Q.1 Write a preprocessor directive statement to define a constant PI having the value 3.14
Ans: #define PI 3.14
Q.2 Classify the examples into Integer, Character and String constants.
'A' 0147 0xEFH 077.7 "A"
26.4 "EFH" '\r' abc
Ans:
INTEGER = 0147
STRING = "EFH" "A"
CHARACTER CONSTANTS = 'A' '\r'
Q.3 Name different categories of Constants :
Ans: Basically of Four Types :
---------------------------------------------------
1. 2. 3. 4.
Integer Floating Point Character String
-------------- -------------- --------- ------
1-Decimal Int. Valid Flt.Pnt. \0 is Null "xyz"
2-Invalid Dec. Invalid Flt.Pnt.
3-Oct.Int. Escape
4-Hex. Int Sequence
1-Valid Hex.
2-Invalid Hex
5-Unsigned Int.
6-Long Int.Const.
===========================================================
3.11 SUMMARY :
----------------------------------------------------------------------------------------
To summarize, we have learnt certain basics which are required to learn a computer language and which form a basis for all languages. The character set includes alphabets, numeric characters, special characters and some graphical characters. These are used to form words in the C language, that is, names or identifiers. Variables are identifiers whose values can change during execution of the program. Keywords are names with a specific meaning and cannot be used otherwise.
We have discussed four basic data types: int, char, float and double. Some qualifiers are used as prefixes to data types, namely signed, unsigned, short and long.
Constants are fixed values and may be of Integer, Floating Point, Character or String type. Symbolic Constants are used to define names for constant values. They let us use a name rather than remembering and writing the value each time.