Once you successfully compile and run the HelloWorld program, you are ready to start dissecting the code to learn its individual parts. Of course, Listing 1.1 is the simplest of C# programs with only a single statement.
The single statement is Console.WriteLine(), which is used to write a line of text to the console. C# generally uses a semicolon to indicate the end of a statement, where a statement comprises one or more actions that the code will perform. Declaring a variable, controlling the program flow, and calling a method are typical uses of statements.
Many programming elements in C# end with a semicolon. One example that does not is a switch statement. Because curly braces are always included in a switch statement, C# does not require a semicolon following the statement. In fact, code blocks themselves are considered statements (they are also composed of statements), and they don’t require a closing semicolon. Conversely, there are cases, such as the using directive, in which a semicolon appears at the end of a construct that is not a statement.
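As a rough illustration of these cases (not one of the numbered listings), the following sketch places a using directive, a declaration statement, a switch statement, and a stand-alone code block side by side. The variable names day and name are invented for the example, and the sketch assumes a compiler that accepts statements outside a class (C# 9.0 or later).

```csharp
using System; // using directive: ends with a semicolon but is not a statement

int day = 3;  // declaration statement: ends with a semicolon
string name;

switch (day)  // switch statement: the braces delimit it, so no trailing semicolon
{
    case 1:
        name = "Monday";
        break;
    case 3:
        name = "Wednesday";
        break;
    default:
        name = "Another day";
        break;
}

{
    // A code block is itself a statement and needs no trailing semicolon.
    Console.WriteLine(name);
}
```

Run as written, the sketch displays Wednesday.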
Since creation of a newline does not separate statements, you can place multiple statements on the same line, and the C# compiler will interpret the line as having multiple instructions. For example, Listing 1.3 contains two statements on a single line that, in combination, display Up and Down on two separate lines.
Similarly, each statement can be placed on its own line, as shown in Listing 1.4.
C# also allows the splitting of a statement across multiple lines. Again, the C# compiler looks for a semicolon to indicate the end of a statement. In Listing 1.5, for example, the original WriteLine() statement from the HelloWorld program is split across multiple lines.
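The three layouts just described can be sketched together as follows. This is an illustration under stated assumptions rather than a reproduction of Listings 1.3 through 1.5: the greeting text is a placeholder, and the statements are written outside any class, which requires C# 9.0 or later.

```csharp
// Two statements on one line: the semicolons, not the newline, separate them.
System.Console.WriteLine("Up"); System.Console.WriteLine("Down");

// The same two statements, each on its own line.
System.Console.WriteLine("Up");
System.Console.WriteLine("Down");

// One statement split across several lines; it ends only at the semicolon.
System.Console.WriteLine(
    "Hello. My name is Inigo Montoya.");
```

The first two forms produce identical output (Up and Down on separate lines), which is the point: the compiler sees the same statements regardless of where the newlines fall.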
In the listings shown so far, the statements stand on their own, outside any other C# construct. C# permits statements written this way, known as top-level statements, in only a single file of a program. These listings are, after all, the simplest of C# programs: variations on HelloWorld. Programs can, however, get vastly more complicated, and structure can be added to organize the code. The simplest structure is to add methods and to place those methods within classes. Listing 1.6 provides an example.
In this listing, the HelloWorld statement is placed into a method called Main, which is placed into a class called Program.
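A minimal sketch of the structure just described might look like the following. It is an illustration rather than a reproduction of Listing 1.6, and the greeting text is a placeholder.

```csharp
public class Program
{
    // The single HelloWorld statement now lives inside the Main method,
    // and the Main method lives inside the Program class.
    public static void Main()
    {
        System.Console.WriteLine("Hello. My name is Inigo Montoya.");
    }
}
```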
Those experienced in programming with Java, C, or C++ will immediately see similarities. Like Java, C# inherits its basic syntax from C and C++. Syntactic punctuation (such as semicolons and curly braces), features (such as case sensitivity), and keywords (such as class, public, and void) are familiar to programmers experienced in these languages.
In Java, the filename must follow the name of the class. In C#, this convention is frequently followed but is not required. In C#, it is possible to have two classes in one file, and even to have a single class span multiple files, with a feature called a partial class.
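To make the contrast with Java concrete, here is a small sketch in which one file holds two classes, one of them declared in two partial pieces. The class name Greeter, its members, and the file names mentioned in the comments are hypothetical; in practice the two partial declarations would usually sit in separate files.

```csharp
using System;

// First part of a partial class; this part could live in Greeter.cs.
public partial class Greeter
{
    public static string Greeting()
    {
        return "Hello from " + Source();
    }
}

// Second part of the same class; this part could live in Greeter.Source.cs.
public partial class Greeter
{
    private static string Source()
    {
        return "a partial class";
    }
}

// A second class in the same file. The file name need not match either class.
public class Program
{
    public static void Main()
    {
        Console.WriteLine(Greeter.Greeting());
    }
}
```

The compiler combines the two partial declarations into a single Greeter class, which is why Greeting() can call Source() even though they are declared in different parts.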
To enable the compiler to interpret the code, certain words within C# have special status and meaning. Known as keywords, they provide the concrete syntax that the compiler uses to interpret the expressions the programmer writes. In the HelloWorld program, class, static, and void are examples of keywords.
The compiler uses the keywords to identify the structure and organization of the code. Because the compiler interprets these words with elevated significance, C# requires that developers place keywords only in certain locations. When programmers violate these rules, the compiler issues errors.
Keywords are another construct common to other programming languages. Table 1.1 shows the C# keywords.
abstract |
add*(1) |
alias*(2) |
and* |
args* |
as |
ascending*(3) |
async*(5) |
await*(5) |
base |
bool |
break |
by*(3) |
byte |
case |
catch |
char |
checked |
class |
const |
continue |
decimal |
default |
delegate |
descending*(3) |
do |
double |
dynamic*(4) |
else |
enum |
equals*(3) |
event |
explicit |
extern |
false |
file* |
finally |
fixed |
float |
for |
foreach |
from*(3) |
get*(1) |
global*(2) |
goto |
group*(3) |
if |
implicit |
in |
init*(9) |
int |
interface |
internal |
into*(3) |
is |
join*(3) |
let*(3) |
lock |
long |
nameof*(6) |
namespace |
new |
nint*(9) |
not* |
notnull*(8) |
null |
nuint*(9) |
object |
on*(3) |
operator |
or* |
orderby*(3) |
out |
override |
params |
partial*(2) |
private |
protected |
public |
readonly |
record* |
ref |
remove*(1) |
required*(11) |
return |
sbyte |
scoped* |
sealed |
select*(3) |
set*(1) |
short |
sizeof |
stackalloc |
static |
string |
struct |
switch |
this |
throw |
true |
try |
typeof |
uint |
ulong |
unchecked |
unmanaged*(7.3) |
unsafe |
ushort |
* Contextual keyword. Numbers in parentheses (n) identify the C# version in which the contextual keyword was added.
After C# 1.0, no new reserved keywords were introduced to C#. However, some constructs in later versions use contextual keywords, which are significant only in specific locations. Outside these designated locations, contextual keywords have no special significance. By this method, even C# 1.0 code is compatible with the later standards.
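The location-dependent behavior of contextual keywords can be sketched with get and set, both of which appear in Table 1.1. The Counter class and its Count property are hypothetical names chosen for this illustration.

```csharp
using System;

public class Counter
{
    // Within a property declaration, get and set act as keywords.
    public int Count { get; set; }
}

public class Program
{
    public static void Main()
    {
        // Outside a property declaration, get and set are ordinary identifiers,
        // so older code that used them as names still compiles.
        int get = 1;
        int set = 2;

        Counter counter = new Counter();
        counter.Count = get + set;
        Console.WriteLine(counter.Count);
    }
}
```

Naming variables get and set is shown here only to demonstrate that it is legal; it is not a recommended naming practice.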
Like other languages, C# includes identifiers to identify constructs that the programmer codes. In Listing 1.6, HelloWorld and Main are examples of identifiers. The identifiers assigned to a construct are used to refer to the construct later, so it is important that the names the developer assigns are meaningful rather than arbitrary.
Clarity coupled with consistency is important enough that the Framework Design Guidelines (http://bit.ly/dotnetguidelines) advise against the use of abbreviations or contractions in identifier names and even recommend avoiding acronyms that are not widely accepted. If an acronym is sufficiently well established (e.g., HTML), you should use it consistently. Avoid spelling out the accepted acronym in some cases but not in others. Generally, adding the constraint that all acronyms be included in a glossary of terms places enough overhead on the use of acronyms that they are not used flippantly. Ultimately, select clear, possibly even verbose names—especially when working on a team or when developing a library that other developers will use.
There are two basic casing formats for an identifier. Pascal case (henceforth PascalCase), as the .NET framework creators refer to it because of its popularity in the Pascal programming language, capitalizes the first letter of each word in an identifier name; examples include ComponentModel, Configuration, and HttpFileCollection. As HttpFileCollection demonstrates with HTTP, when using acronyms that are more than two letters long, only the first letter is capitalized. The second format, camel case (henceforth camelCase), follows the same convention except that the first letter is lowercase; examples include quotient, firstName, httpFileCollection, ioStream, and theDreadPirateRoberts.
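The two casing formats might appear together as in the sketch below. HtmlReportWriter, BuildTitle, and reportTitle are hypothetical names, and assigning PascalCase to types and methods and camelCase to parameters and local variables reflects common .NET practice rather than anything stated in this passage.

```csharp
using System;

// PascalCase type name; the acronym HTML is written Html
// because it is more than two letters long.
public class HtmlReportWriter
{
    // PascalCase method name, camelCase parameter name.
    public static string BuildTitle(string firstName)
    {
        // camelCase local variable.
        string reportTitle = "Report for " + firstName;
        return reportTitle;
    }
}

public class Program
{
    public static void Main()
    {
        Console.WriteLine(HtmlReportWriter.BuildTitle("Inigo"));
    }
}
```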
Notice that although underscores are legal, generally there are no underscores, hyphens, or other nonalphanumeric characters in identifier names. Furthermore, C# doesn’t follow its predecessors in that Hungarian notation (prefixing a name with a data type abbreviation) is not used. This convention avoids the variable rename that is necessary when data types change, or the inconsistency introduced due to failure to adjust the data type prefix when using Hungarian notation.
In rare cases, some identifiers, such as Main, can have a special meaning in the C# language.
While naming guidelines may seem relatively trivial, especially when coming from other languages where such guidelines are ambiguous or missing entirely, violations of these guidelines will be glaring to an experienced C#/.NET programmer and indicate poor quality or inexperience. To avoid this, learn the guidelines, and follow them religiously.
Although it is rare, keywords may be used as identifiers if they include @ as a prefix. For example, you could name a local variable @return. Similarly (although it doesn’t conform to the casing conventions of the C# coding standards), it is possible to name a method @throw, with parentheses following: @throw().
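A short sketch of both cases follows. The method body and the string it returns are invented for the illustration; only the names @return and @throw come from the text.

```csharp
using System;

public class Program
{
    // A method named with the keyword throw; the @ prefix makes it a legal identifier.
    public static string @throw()
    {
        return "thrown";
    }

    public static void Main()
    {
        // A local variable named with the keyword return.
        string @return = @throw();
        Console.WriteLine(@return);
    }
}
```

The @ is not part of the name itself; it only tells the compiler to treat the word that follows as an identifier rather than as a keyword.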
There are also four undocumented reserved keywords in the Microsoft implementation: __arglist, __makeref, __reftype, and __refvalue. These are required only in rare interop scenarios, and you can ignore them for all practical purposes. Note that these four special keywords begin with two underscores. The designers of C# reserve the right to make any identifier that begins with two underscores into a keyword in a future version; for safety, avoid ever creating such an identifier yourself.
A class definition is the section of code that generally begins with class <identifier> { ... }, as shown in Listing 1.7, where HelloWorld is the identifier.
The name used for the type (in this case, HelloWorld) can vary, but by convention, it should be PascalCased. For this example, therefore, other possible names are Greetings, HelloInigoMontoya, Hello, or simply Program. (Program is a good convention to follow when the class contains the Main() method, described next.)
Generally, programs contain multiple types, each containing multiple methods.
Syntactically, a method in C# is a named block of code introduced by a method declaration (e.g., static void Main()) and (usually) followed by zero or more statements within curly braces. Methods perform computations and/or actions. Like paragraphs in written languages, methods provide a means of structuring and organizing code so that it is more readable. More important, methods can be reused and called from multiple places and so avoid the need to duplicate code. The method declaration introduces the method and defines the method name along with the data passed to and from the method. In Listing 1.8, Main() followed by { ... } is an example of a C# method.
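To show the reuse that methods make possible, the following sketch declares one method and calls it from two places. FormatGreeting and its name parameter are hypothetical, and the example is separate from Listing 1.8.

```csharp
using System;

public class Program
{
    // Method declaration: the method name (FormatGreeting), the data
    // passed in (name), and the type of data passed back (string).
    public static string FormatGreeting(string name)
    {
        return "Hello, " + name + ".";
    }

    public static void Main()
    {
        // The same method is called twice instead of duplicating its code.
        Console.WriteLine(FormatGreeting("Inigo"));
        Console.WriteLine(FormatGreeting("Westley"));
    }
}
```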
The location where C# programs begin execution is the Main method, which begins with static void Main(). When you execute the program by typing dotnet run on the terminal, the program starts with the Main method and begins executing the first statement, as identified in Listing 1.8.
Although the Main method declaration can vary to some degree, static and the method name, Main, are always required for a program (see “Advanced Topic: Declaration of the Main Method”).
The comments, text that begins with // in Listing 1.8, are explained later in the chapter. They are included to identify the various constructs in the listing.
C# requires that the Main method return either void or int and that it take either no parameters or a single array of strings. Listing 1.9 shows the full declaration of the Main method. The args parameter is an array of strings corresponding to the command-line arguments. The executable name is not included in the args array (unlike in C and C++). To retrieve the full command used to execute the program, including the program name, use Environment.CommandLine.
The int returned from Main() is the status code, and it indicates the success of the program’s execution. A return of a nonzero value generally indicates an error.
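The following sketch combines the args parameter with an int status code. The rule that at least one argument is required is an assumption made purely for the example, not something C# imposes.

```csharp
using System;

public class Program
{
    public static int Main(string[] args)
    {
        // args holds only the command-line arguments, not the executable name.
        Console.WriteLine("Argument count: " + args.Length);

        // Environment.CommandLine includes the program name as well.
        Console.WriteLine(Environment.CommandLine);

        // Hypothetical rule for this sketch: at least one argument is required.
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: supply at least one argument.");
            return 1; // a nonzero status code signals an error
        }

        return 0; // zero signals success
    }
}
```

With the .NET CLI, arguments after a double dash are passed through to the program, so dotnet run -- first second would supply two arguments to this Main method.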
C# 7.1 also added support for async/await on the Main method, in which case Main returns Task or Task<int> rather than void or int.
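An asynchronous Main might be sketched as follows. Task.Delay stands in for real asynchronous work, and the 100-millisecond duration is arbitrary.

```csharp
using System;
using System.Threading.Tasks;

public class Program
{
    // Returning Task<int> preserves the status code; a plain Task
    // would be the asynchronous counterpart of void.
    public static async Task<int> Main(string[] args)
    {
        // Placeholder for real asynchronous work such as a network call.
        await Task.Delay(100);

        Console.WriteLine("Done waiting.");
        return 0;
    }
}
```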
The designation of the Main method as static indicates that other methods may call it directly off the class definition. Without the static designation, the terminal that started the program would need to perform additional work (known as instantiation) before calling the method. (Chapter 6 contains an entire section devoted to the topic of static members.)
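The difference that static makes can be seen in a small sketch that calls one static method and one non-static (instance) method. The Greeter class and both of its methods are hypothetical.

```csharp
using System;

public class Greeter
{
    // Static: callable directly on the class, with no object required.
    public static string StaticGreeting()
    {
        return "Hello from a static method.";
    }

    // Not static: callable only on an object created from the class.
    public string InstanceGreeting()
    {
        return "Hello from an instance method.";
    }
}

public class Program
{
    public static void Main()
    {
        // Called directly off the class definition.
        Console.WriteLine(Greeter.StaticGreeting());

        // Instantiation is the additional work needed before the call.
        Greeter greeter = new Greeter();
        Console.WriteLine(greeter.InstanceGreeting());
    }
}
```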
Placing void prior to Main() indicates that this method does not return any data. (This is explained further in Chapter 2.)
One distinctive C/C++-style characteristic followed by C# is the use of curly braces for the body of a construct, such as the class or the method. For example, the Main method contains curly braces that surround its implementation; in this case, only one statement appears in the method.
Whitespace is the combination of one or more consecutive formatting characters such as tab, space, and newline characters. Eliminating all whitespace between words is obviously significant, as is including whitespace within a quoted string.
The semicolon makes it possible for the C# compiler to ignore whitespace in code. Apart from a few exceptions, C# allows developers to insert whitespace throughout the code without altering its semantic meaning. In Listings 1.3 through 1.5, it didn’t matter whether a newline was inserted within a statement, placed between statements, or eliminated entirely; none of these choices affected the resultant executable created by the compiler.
Frequently, programmers use whitespace to indent code for greater readability. Consider the two variations on HelloWorld shown in Listing 1.10 and Listing 1.11. Although these two examples look significantly different from the original program, the C# compiler sees them as semantically equivalent.
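The effect can be sketched in a single file, under the assumption that each variation is moved into its own hypothetical class (Unindented and Compact) so that all of them compile together. The greeting text is a placeholder, and the sketch is not a reproduction of Listing 1.10 or Listing 1.11.

```csharp
// Variation 1: no indentation at all.
public class Unindented
{
public static string Greeting()
{
return "Hello. My name is Inigo Montoya.";
}
}

// Variation 2: nearly all optional whitespace removed.
public class Compact{public static string Greeting(){return "Hello. My name is Inigo Montoya.";}}

// Conventionally formatted entry point that calls both variations.
public class Program
{
    public static void Main()
    {
        System.Console.WriteLine(Unindented.Greeting());
        System.Console.WriteLine(Compact.Greeting());
    }
}
```

Both calls display the same text. Note that the whitespace that remains in the Compact class (for example, between public and class) separates words and therefore cannot be removed.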
Indenting the code using whitespace is important for greater readability. As you begin writing code, you need to follow established coding standards and conventions to enhance code readability.
The convention used in this book is to place curly braces on their own line and to indent the code contained between the curly brace pair. If another curly brace pair appears within the first pair, all the code within the second set of braces is also indented.
This is not a uniform C# standard but a stylistic preference.
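A brief sketch of this brace and indentation style, with one brace pair nested inside another, follows. The if statement and the message text are included only to supply a second level of nesting.

```csharp
using System;

public class Program
{
    public static void Main(string[] args)
    {
        // First brace pair (the method body): its contents are indented one level.
        if (args.Length > 0)
        {
            // Second brace pair, nested in the first: indented one level further.
            Console.WriteLine("First argument: " + args[0]);
        }
    }
}
```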
________________________________________