I don’t understand why we need the ‘new’ keyword

There are a lot of misconceptions here, both in the question itself and in the several answers.

Let me begin by examining the premise of the question. The question is “why do we need the new keyword in C#?” The motivation for the question is this fragment of C++:

 MyClass object; // this will create object in memory
 MyClass* object = new MyClass(); // this does same thing

I criticize this question on two grounds.

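First, these do not do the same thing in C++, so the question is based on a faulty understanding of the C++ language. The first declares an object directly (with automatic storage duration when it is a local variable, so it is destroyed when it goes out of scope); the second allocates an object with dynamic storage duration and yields a pointer to it, and that object lives until it is explicitly deleted. It is very important to understand this difference, so if it is not completely clear to you, find a mentor who can teach you what the difference is and when to use each.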

Second, the question presupposes — incorrectly — that those two syntaxes do the same thing in C++, and then, oddly, asks “why do we need new in C#?” Surely the right question to ask given this — again, false — presupposition is “why do we need new in C++?” If those two syntaxes do the same thing — which they do not — then why have two syntaxes in the first place?

So the question is based on a false premise, and the question about C# does not actually follow from the (misunderstood) design of C++.

This is a mess. Let’s throw out this question and ask some better questions. And let’s ask the question about C# qua C#, and not in the context of the design decisions of C++.

What does the new X operator do in C#, where X is a class or struct type? (Let’s ignore delegates and arrays for the purposes of this discussion.)

The new operator:

  • Causes a new instance of the given type to be allocated; new instances have all their fields initialized to default values.
  • Causes a constructor of the given type to be executed.
  • Produces a reference to the allocated object if the type is a reference type, or the value itself if the type is a value type.
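The three steps above can be seen in a small sketch. The types Point and Pair are hypothetical names invented for this illustration; the only point is the difference between getting a reference back and getting a value back.

```csharp
using System;

// Hypothetical reference type, for illustration only.
class Point
{
    public int X;                 // never set by the constructor, so it keeps its default of 0
    public int Y;
    public Point(int y) { Y = y; }
}

// Hypothetical value type, for illustration only.
struct Pair
{
    public int A;
    public int B;
    public Pair(int a) { A = a; B = 0; }
}

static class NewDemo
{
    public static void Main()
    {
        Point p = new Point(5);   // allocate, default the fields, run the constructor, produce a reference
        Point q = p;              // copies the reference: p and q refer to the same object
        q.X = 42;
        Console.WriteLine(p.X);   // 42

        Pair s = new Pair(7);     // produces the value itself
        Pair t = s;               // copies the value
        t.A = 99;
        Console.WriteLine(s.A);   // 7
    }
}
```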

All right, I can already hear the objections from C# programmers out there, so let’s dismiss them.

Objection: no new storage is allocated if the type is a value type, I hear you say. Well, the C# specification disagrees with you. When you say

S s = new S(123);

for some struct type S, the spec says that new temporary storage is allocated on the short-term pool and initialized to its default values, the constructor runs with this set to refer to the temporary storage, and then the resulting object is copied to s. However, the compiler is permitted to use a copy-elision optimization provided that it can prove that the optimization cannot be observed in a safe program. (Exercise: work out under what circumstances a copy elision cannot be performed; give an example of a program that would behave differently depending on whether elision was used.)

Objection: a valid instance of a value type can be produced using default(S); no constructor is called, I hear you say. That’s correct. I didn’t say that new is the only way to create an instance of a value type.

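In fact, for a value type new S() and default(S) are the same thing, at least when S declares no parameterless constructor of its own (which C# did not permit before version 10).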
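A quick way to convince yourself of this, as a sketch: Money is a hypothetical struct made up for the example, and it is assumed to declare no parameterless constructor.

```csharp
using System;

// Hypothetical struct with no declared constructors at all.
struct Money
{
    public decimal Amount;
    public string Currency;
}

static class DefaultDemo
{
    public static void Main()
    {
        Money a = new Money();
        Money b = default(Money);

        Console.WriteLine(a.Equals(b));          // True: every field has its default value in both
        Console.WriteLine(a.Amount);             // 0
        Console.WriteLine(a.Currency == null);   // True
    }
}
```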

Objection: is a constructor really executed for an expression like new S() when no such constructor is present in the source code, as in C# 6, I hear you say. This is an “if a tree falls in the forest and no one hears it, does it make a sound?” question. Is there a difference between a call to a constructor that does nothing, and no call at all? This is not an interesting question. The compiler is free to elide calls that it knows do nothing.

Suppose we have a variable of value type. Must we initialize the variable with an instance produced by new?

No. Variables which are automatically initialized, such as fields and array elements, will be initialized to the default value; that is, the value of the struct in which all the fields themselves have their default values.

Formal parameters will be initialized with the argument, obviously.

Local variables of value type are required to be definitely assigned with something before the fields are read, but it need not be a new expression.
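Here is a minimal sketch of those three cases for a value type. Size and Holder are hypothetical names invented for the illustration.

```csharp
using System;

// Hypothetical value type and a class that holds one as a field.
struct Size { public int Width; public int Height; }
class Holder { public Size Field; }

static class ValueInitDemo
{
    public static void Main()
    {
        Holder h = new Holder();
        Console.WriteLine(h.Field.Width);     // 0: fields are automatically initialized

        Size[] sizes = new Size[3];
        Console.WriteLine(sizes[2].Height);   // 0: so are array elements

        Size local;
        // Console.WriteLine(local.Width);    // compile-time error: the field is not definitely assigned
        local.Width = 3;
        local.Height = 4;                     // assigning every field definitely assigns the local; no new needed
        Console.WriteLine(local.Width * local.Height);   // 12
    }
}
```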

So effectively, variables of value type are automatically initialized with the equivalent of default(S), unless they are locals?

Yes.

Why not do the same for locals?

Use of an uninitialized local is strongly associated with buggy code. The C# language disallows this because doing so finds bugs.

Suppose we have a variable of reference type. Must we initialize the variable with an instance produced by new?

No. Automatically initialized variables will be initialized with null. Locals can be initialized with any reference, including null, and must be definitely assigned before being read.
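The reference-type version of the same rules, as a sketch. Node is a hypothetical class, and the example assumes nullable reference type annotations are not enabled (with them enabled the code still runs but produces warnings).

```csharp
using System;

// Hypothetical reference type, for illustration only.
class Node { public Node Next; }

static class RefInitDemo
{
    public static void Main()
    {
        Node n = new Node();
        Console.WriteLine(n.Next == null);       // True: the field was automatically initialized to null

        Node[] nodes = new Node[2];
        Console.WriteLine(nodes[0] == null);     // True: so were the array elements

        Node c;
        // Console.WriteLine(c == null);         // compile-time error: use of unassigned local variable
        c = n;                                   // any reference will do; no new expression required
        Console.WriteLine(object.ReferenceEquals(c, n));   // True
    }
}
```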

So effectively, variables of reference type are automatically initialized with null, unless they are locals?

Yes.

Why not do the same for locals?

Same reason. A likely bug.

Why not automatically initialize variables of reference type by calling the default constructor automatically? That is, why not make R r; the same as R r = new R();?

Well, first of all, many types do not have a default constructor, or for that matter, any accessible constructor at all. Second, it seems weird to have one rule for an uninitialized local or field, another rule for a formal, and yet another rule for an array element. Third, the existing rule is very simple: a variable must be initialized to a value; that value can be anything you like; why is the assumption that a new instance is desired warranted? It would be bizarre if this

R r;
if (x) r = M(); else r = N();

caused a constructor to run to initialize r.
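To make that concrete, here is a sketch that counts constructor runs. The counter, the Name field, and the bodies of M and N are assumptions added for the illustration; note that this R has no default constructor at all, so there would be nothing for an automatic initialization to call.

```csharp
using System;

class R
{
    public static int Constructed;    // hypothetical counter: how many times a constructor has run
    public readonly string Name;
    public R(string name) { Name = name; Constructed++; }
}

static class BranchDemo
{
    static R M() { return new R("from M"); }
    static R N() { return new R("from N"); }

    public static R Pick(bool x)
    {
        R r;                              // declaration only: no constructor runs here
        if (x) r = M(); else r = N();     // r is definitely assigned on both branches
        return r;
    }

    public static void Main()
    {
        R r = Pick(true);
        Console.WriteLine(r.Name);          // from M
        Console.WriteLine(R.Constructed);   // 1
    }
}
```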

Leaving aside the semantics of the new operator, why is it necessary syntactically to have such an operator?

It’s not. There are any number of alternative syntaxes that could be grammatical. The most obvious would be to simply eliminate the new entirely. If we have a class C with a constructor C(int) then we could simply say C(123) instead of new C(123). Or we could use a syntax like C.construct(123) or some such thing. There are any number of ways to do this without the new operator.
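Something close to the C.construct(123) spelling can already be approximated in today's C# with a static factory method, though the new operator still appears inside it. This is only a sketch; the Construct method and the Value field are invented for the example.

```csharp
using System;

class C
{
    public readonly int Value;

    private C(int value) { Value = value; }

    // Hypothetical factory standing in for the imagined "C.construct(123)" syntax.
    public static C Construct(int value) { return new C(value); }
}

static class FactoryDemo
{
    public static void Main()
    {
        C c = C.Construct(123);       // the caller never writes "new"
        Console.WriteLine(c.Value);   // 123
    }
}
```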

So why have it?

First, C# was designed to be immediately familiar to users of C++, Java, JavaScript, and other languages that use new to indicate new storage is being initialized for an object.

Second, the right level of syntactic redundancy is highly desirable. Object creation is special; we wish to call out when it happens with its own operator.
