Newbie Questions & Answers
What is this “newbie section” all about?
It’s a randomly ordered collection containing a few questions newbies might ask.
- This section doesn’t pretend to be organized. Think of it as random. In truth, think of it as a hurried, initial cut by a busy guy.
- This section doesn’t pretend to be complete. Think of it as offering a little help to a few people. It won’t help everyone and it might not help you.
Hopefully someday we’ll be able to improve this section, but for now, it is incomplete and unorganized. If that bothers
you, my suggestion is to click that little x
on the extreme upper right of your browser window. :-)
Where do I start?
Read the FAQ, especially the section on learning C++, and read books (plural).
But if everything still seems too hard, if you’re feeling bombarded with mysterious terms and concepts, if you’re wondering how you’ll ever grasp anything, do this:
- Type in some C++ code from any of the sources listed above.
- Get it to compile and run.
- Repeat.
That’s it. Just practice and play. Hopefully that will give you a foothold.
Here are some places you can get “sample problems” (in alphabetical order):
- The British Informatics Olympiad
- The Dictionary of Algorithms and Data Structures
- The University of Valladolid Programming Contest Site
How do I read a string from input?
You can read a single, whitespace terminated word like this:
#include<iostream>
#include<string>
using namespace std;
int main()
{
cout << "Please enter a word:\n";
string s;
cin>>s;
cout << "You entered " << s << '\n';
}
Note that there is no explicit memory management and no fixed-sized buffer that you could possibly overflow.
If you really need a whole line (and not just a single word) you can do this:
#include<iostream>
#include<string>
using namespace std;
int main()
{
cout << "Please enter a line:\n";
string s;
getline(cin,s);
cout << "You entered " << s << '\n';
}
For a brief introduction to standard library facilities, such as iostream and string, see Chapter 3 of TC++PL3 (available online). For a detailed comparison of simple uses of C and C++ I/O, see “Learning Standard C++ as a New Language”, which you can download from Stroustrup’s publications list.
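A common follow-up problem is mixing the two styles: after cin >> word, the newline you typed is still sitting in the input buffer, so an immediately following getline() appears to read an “empty” line. Here is a minimal sketch of one way to deal with that (the prompts and variable names are just for illustration):
#include <iostream>
#include <string>
#include <limits>
using namespace std;
int main()
{
    cout << "Please enter a word:\n";
    string word;
    cin >> word; // reads up to whitespace, leaving the rest of the line (including '\n') in the buffer
    cin.ignore(numeric_limits<streamsize>::max(), '\n'); // discard the rest of that line
    cout << "Please enter a line:\n";
    string line;
    getline(cin, line); // now reads the next full line, not the leftover '\n'
    cout << "word: " << word << "\nline: " << line << '\n';
}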
How do I write this very simple program?
Often, especially at the start of semesters, there is a small flood of questions about how to write very simple programs. Typically, the problem to be solved is to read in a few numbers, do something with them, and write out an answer. Here is a sample program that does that:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
int main()
{
vector<double> v;
double d;
while(cin>>d) v.push_back(d); // read elements
if (!cin.eof()) { // check if input failed
cerr << "format error\n";
return 1; // error return
}
cout << "read " << v.size() << " elements\n";
reverse(v.begin(),v.end());
cout << "elements in reverse order:\n";
for (int i = 0; i<v.size(); ++i) cout << v[i] << '\n';
return 0; // success return
}
Here are a few observations about this program:
- This is a Standard ISO C++ program using the standard library. Standard library facilities are declared in namespace std in headers without a .h suffix.
- If you want to compile this on a Windows machine, you need to compile it as a “console application”. Remember to give your source file the .cpp suffix or the compiler might think that it is C (not C++) source.
- Yes, main() returns an int.
- Reading into a standard vector guarantees that you don’t overflow some arbitrary buffer. Reading into an array without making a “silly error” is beyond the ability of complete novices – by the time you get that right, you are no longer a complete novice. If you doubt this claim, read Stroustrup’s paper “Learning Standard C++ as a New Language”, which you can download here.
- The !cin.eof() is a test of the stream’s format. Specifically, it tests whether the loop ended by finding end-of-file (if not, you didn’t get input of the expected type/format). For more information, look up “stream state” in your C++ textbook.
- A vector knows its size, so I don’t have to count elements.
- Yes, you could declare i to be a vector<double>::size_type rather than plain int to quiet warnings from some hyper-suspicious compilers, but in this case, I consider that too pedantic and distracting.
- This program contains no explicit memory management, and it does not leak memory. A vector keeps track of the memory it uses to store its elements. When a vector needs more memory for elements, it allocates more; when a vector goes out of scope, it frees that memory. Therefore, the user need not be concerned with the allocation and deallocation of memory for vector elements.
- For reading in strings, see How do I read a string from input?.
- The program ends reading input when it sees “end of file”. If you run the program from the keyboard on a Unix machine, “end of file” is Ctrl-D. If you are on a Windows machine that because of a bug doesn’t recognize an end-of-file character, you might prefer this slightly more complicated version of the program that terminates input with the word “end”:
#include <iostream>
#include <vector>
#include <algorithm>
#include <string>
using namespace std;
int main()
{
vector<double> v;
double d;
while(cin>>d) v.push_back(d); // read elements
if (!cin.eof()) { // check if input failed
cin.clear(); // clear error state
string s;
cin >> s; // look for terminator string
if (s != "end") {
cerr << "format error\n";
return 1; // error return
}
}
cout << "read " << v.size() << " elements\n";
reverse(v.begin(),v.end());
cout << "elements in reverse order:\n";
for (int i = 0; i<v.size(); ++i) cout << v[i] << '\n';
return 0; // success return
}
For more examples of how to use the standard library to do simple things simply, see Parts 3 and 4 of the Tour of C++.
How do I convert an integer to a string?
Call to_string. This is new in C++11, widely available, and as of this writing widely not-noticed. :)
int i = 127;
string s = to_string(i);
How do I convert a string to an integer?
Call stoi:
string s = "127";
int i = stoi(s);
The related functions stol and stoll will convert a string to a long or a long long, respectively.
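Note that stoi() reports failure by throwing: std::invalid_argument if no conversion could be performed, and std::out_of_range if the value doesn’t fit in an int. A minimal sketch of catching those (the messages are just for illustration):
#include <iostream>
#include <string>
#include <stdexcept>
using namespace std;
int main()
{
    string s = "not a number";
    try {
        int i = stoi(s);
        cout << "converted: " << i << '\n';
    }
    catch (const invalid_argument&) {
        cout << "no digits to convert\n";
    }
    catch (const out_of_range&) {
        cout << "number is too big for an int\n";
    }
}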
Should I use void main() or int main()?
int main()
main() must return int. Some compilers accept void main(), but that is non-standard and shouldn’t be used. Instead use int main(). As to the specific return value, if you don’t know what else to return, just say return 0;
The definition
void main() { /* ... */ }
is not and never has been C++, nor has it even been C. See the ISO C++ standard 3.6.1[2] or the ISO C standard 5.1.2.2.1. A conforming implementation accepts
int main() { /* ... */ }
and
int main(int argc, char* argv[]) { /* ... */ }
A conforming implementation may provide more versions of main(), but they must all have return type int. The int returned by main() is a way for a program to return a value to “the system” that invokes it. On systems that don’t provide such a facility the return value is ignored, but that doesn’t make void main() legal C++ or legal C. Even if your compiler accepts void main(), avoid it, or risk being considered ignorant by C and C++ programmers.
In C++, main() need not contain an explicit return statement. In that case, the value returned is 0, meaning successful execution. For example:
#include<iostream>
int main()
{
std::cout << "This program returns the integer value 0\n";
}
Note also that neither ISO C++ nor C99 allows you to leave the type out of a declaration. That is, in contrast to C89 and ARM C++, int is not assumed where a type is missing in a declaration. Consequently:
#include<iostream>
main() { /* ... */ }
is an error because the return type of main() is missing.
Should I use f(void) or f()?
f()
C programmers often use f(void) when declaring a function that takes no parameters; however, in C++ that is considered bad style. In fact, the f(void) style has been called an “abomination” by Bjarne Stroustrup, the creator of C++, Dennis Ritchie, the co-creator of C, and Doug McIlroy, head of the research department where Unix was born.
If you’re writing C++ code, you should use f(). The f(void) style is legal in C++, but only to make it easier to compile C code.
This C++ code shows the best way to declare a function that takes no parameters:
void f(); // declares (not defines) a function that takes no parameters
This C++ code both declares and defines a function that takes no parameters:
void f() // declares and defines a function that takes no parameters
{
// ...
}
The following C++ code also declares a function that takes no parameters, but it uses the less desirable (some would say “abomination”) style, f(void):
void f(void); // undesirable style for C++; use void f() instead
Actually this f() thing is all you need to know about C++. That and using those newfangled // comments. Once you know those two things, you can claim to be a C++ expert. Go for it: type those magical “++” marks on your resumé.
Who cares about all that OO stuff — why should you bother changing the way you think? After all, the really
important thing isn’t thinking; it’s typing in function declarations and comments. (Sigh; I wish nobody actually thought
that way.)
What are the criteria for choosing between short / int / long data types?
Other related questions: If a short int is the same size as an int on my particular implementation, why choose one or the other? If I start taking the actual size in bytes of the variables into account, won’t I be making my code unportable (since the size in bytes may differ from implementation to implementation)? Or should I simply go with sizes much larger than I actually need, as a sort of safety buffer?
Answer: It’s usually a good idea to write code that can be ported to a different operating system and/or compiler.
After all, if you’re successful at what you do, someone else might want to use it somewhere else. This can be a little
tricky with built-in types like int and short, since C++ doesn’t give guaranteed sizes. However C++ gives you two things that might help: guaranteed minimum sizes, which are usually all you need to know, and a standard C header that provides typedefs for sized integers.
C++ guarantees a char is exactly one byte, which is at least 8 bits, short is at least 16 bits, int is at least 16 bits, and long is at least 32 bits. It also guarantees the unsigned version of each of these is the same size as the original, for example, sizeof(unsigned short) == sizeof(short).
When writing portable code, you shouldn’t make additional assumptions about these sizes. For example, don’t assume int has 32 bits. If you have an integral variable that needs at least 32 bits, use a long or unsigned long even if sizeof(int) == 4 on your particular implementation. On the other hand, if you have an integral variable quantity that will always fit within 16 bits and if you want to minimize the use of data memory, use a short or unsigned short even if you know sizeof(int) == 2 on your particular implementation.
The other option is to use the following standard C header (which may or may not be provided by your C++ compiler vendor; since C++11 the same typedefs are also available via <cstdint>):
#include <stdint.h> /* in C++11 and later, #include <cstdint> also works */
That header defines typedefs for things like int32_t and uint16_t, which are a signed 32-bit integer and an unsigned 16-bit integer, respectively. There are other goodies in there, as well. My recommendation is that you use these “sized” integral types only where they are actually needed. Some people worship consistency, and they are sorely tempted to use these sized integers everywhere simply because they were needed somewhere. Consistency is good, but it is not the greatest good, and using these typedefs everywhere can cause some headaches and even possible performance issues. Better to use common sense, which often leads you to use the normal keywords, e.g., int, unsigned, etc. where you can, and the explicitly sized integer types, e.g., int32_t, etc. where you must.
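For example, here is a small sketch of that style: plain int where any int will do, and a sized typedef only where the exact width actually matters (the struct and field names are made up for illustration):
#include <stdint.h> /* or <cstdint> in C++11 and later */

// A record whose on-disk or on-the-wire layout requires exact widths:
struct PacketHeader {
    uint16_t length;  // exactly 16 bits, unsigned
    int32_t  offset;  // exactly 32 bits, signed
};

// Ordinary code where the guaranteed minimums are enough:
int count_spaces(const char* s)
{
    int n = 0; // plain int is fine here
    for (; *s != '\0'; ++s)
        if (*s == ' ')
            ++n;
    return n;
}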
Note that there are some subtle tradeoffs here. In some cases, your computer might be able to manipulate smaller things faster than bigger things, but in other cases it is exactly the opposite: int arithmetic might be faster than short arithmetic on some implementations. Another tradeoff is data-space against code-space: int arithmetic might generate less binary code than short arithmetic on some implementations. Don’t make simplistic assumptions. Just because a particular variable can be declared as short doesn’t necessarily mean it should, even if you’re trying to save space.
Note that the C standard doesn’t guarantee that <stdint.h> defines intN_t and uintN_t specifically for N = 8, 16, 32 or 64. However if the underlying implementation provides integers with any of those sizes, <stdint.h> is required to contain the corresponding typedefs. Furthermore you are guaranteed to have typedefs for sizes N = 8, 16 and 32 if your implementation is POSIX compliant. Put all that together and it’s fair to say that the vast majority of implementations, though not all implementations, will have typedefs for those typical sizes.
What the heck is a const variable? Isn’t that a contradiction in terms?
If it bothers you, call it a “const identifier” instead.
The main issue is to figure out what it is; we can figure out what to call it later. For example, consider the symbol max in the following function:
void f()
{
const int max = 107;
// ...
float array[max];
// ...
}
It doesn’t matter whether you call max a const variable or a const identifier. What matters is that you realize it is like a normal variable in some ways (e.g., you can take its address or pass it by const-reference), but it is unlike a normal variable in that you can’t change its value.
Here is another even more common example:
class Fred {
public:
// ...
private:
static const int max_ = 107;
// ...
};
In this example, you would need to add the line const int Fred::max_; in exactly one .cpp file, typically in Fred.cpp.
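For example, a minimal sketch of how that might look across the two files (the file names are just the usual convention):
// Fred.h
class Fred {
public:
    // ...
private:
    static const int max_ = 107; // declaration with its initializer
    // ...
};

// Fred.cpp
#include "Fred.h"
const int Fred::max_; // the one-and-only definition (needed if max_'s address is ever taken)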
It is generally considered good programming practice to give each “magic number” (like 107) a symbolic name and use that name rather than the raw magic number.
Why would I use a const variable / const identifier as opposed to #define?
const identifiers are often better than #define because:
- they obey the language’s scoping rules
- you can see them in the debugger
- you can take their address if you need to
- you can pass them by const-reference if you need to
- they don’t create new “keywords” in your program.
In short, const identifiers act like they’re part of the language because they are part of the language. The preprocessor can be thought of as a language layered on top of C++. You can imagine that the preprocessor runs as a separate pass through your code, which would mean your original source code would be seen only by the preprocessor, not by the C++ compiler itself. In other words, you can imagine that the preprocessor sees your original source code, replaces all #define symbols with their values, and only then does the C++ compiler proper see the modified source code.
There are cases where #define is needed, but you should generally avoid it when you have the choice. You should evaluate whether to use const vs. #define based on business value: time, money, risk. In other words, one size does not fit all. Most of the time you’ll use const rather than #define for constants, but sometimes you’ll use #define. But please remember to wash your hands afterwards.
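To make the scoping point concrete, here is a small sketch contrasting the two (the names are made up for illustration):
#define BUFFER_SIZE 1024 // visible everywhere from here on, in every scope

namespace parser {
    const int max_depth = 42; // obeys normal scoping rules: parser::max_depth
}

void f()
{
    const int max_depth = 7;     // a different, local max_depth; perfectly fine
    int buffer[BUFFER_SIZE];     // the macro is replaced wherever its name appears
    // int BUFFER_SIZE = 10;     // error: the macro would be substituted even here
    (void)buffer;
}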
Are you saying that the preprocessor is evil?
Yes, that’s exactly what I’m saying: the preprocessor is evil.
Every #define macro effectively creates a new keyword in every source file and every scope until that symbol is #undef’d. The preprocessor lets you create a #define symbol that is always replaced, independent of the {...} scope where that symbol appears.
Sometimes we need the preprocessor, such as the #ifndef/#define wrapper within each header file, but you should avoid it when you can. “Evil” doesn’t mean “never use.” You will use evil things sometimes, particularly when they are “the lesser of two evils.” But they’re still evil :-)
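For completeness, here is the usual #ifndef/#define wrapper (“include guard”) being referred to; the guard name is just a common convention:
// my_header.h
#ifndef MY_HEADER_H
#define MY_HEADER_H

// ... declarations go here ...

#endif // MY_HEADER_H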
What is the “standard library”? What is included / excluded from it?
Most (not all) implementations have a “standard include” directory, sometimes directories plural. If your implementation is like that, the headers in the standard library are probably a subset of the files in those directories. For example, iostream and string are part of the standard library, as are cstring and cstdio. There are a bunch of .h files that are also part of the standard library, but not every .h file in those directories is part of the standard library. For example, stdio.h is but windows.h is not.
You include headers from the standard library like this:
#include <iostream>
int main()
{
std::cout << "Hello world!\n";
// ...
}
How should I lay out my code? When should I use spaces, tabs, and/or newlines in my code?
The short answer is: Just like the rest of your team. In other words, the team should use a consistent approach to whitespace, but otherwise please don’t waste a lot of time worrying about it.
Here are a few details:
There is no universally accepted coding standard when it comes to whitespace. There are a few popular whitespace standards, such as the “one true brace” style, but there is a lot of contention over certain aspects of any given coding standard.
Most whitespace standards agree on a few points, such as putting a space around infix operators like x * y or a - b. Most (not all) whitespace standards do not put spaces around the [ or ] in a[i], and similar comments apply for ( and ) in f(x). However there is a great deal of contention over vertical whitespace, particularly when it comes to { and }. For example, here are a few of the many ways to lay out if (foo()) { bar(); baz(); }:
if (foo()) {
bar();
baz();
}
if (foo())
{
bar();
baz();
}
if (foo())
    {
    bar();
    baz();
    }
if (foo())
  {
    bar();
    baz();
  }
if (foo()) {
    bar();
    baz();
    }
…and others…
IMPORTANT: Do NOT email me with reasons your whitespace approach is better than the others. I don’t care. Plus I won’t believe you. There is no objective standard of “better” when it comes to whitespace so your opinion is just that: your opinion. If you write me an email in spite of this paragraph, I will consider you to be a hopeless geek who focuses on nits. Don’t waste your time worrying about whitespace: as long as your team uses a consistent whitespace style, get on with your life and worry about more important things.
For example, things you should be worried about include design issues like when ABCs should be used, whether inheritance should be an implementation or specification technique, what testing and inspection strategies should be used, whether interfaces should uniformly have a get() and/or set() member function for each data member, whether interfaces should be designed from the outside-in or the inside-out, whether errors should be handled by try/catch/throw or by return codes, etc. Read the FAQ for some opinions on those important questions, but please don’t waste your time arguing over whitespace. As long as the team is using a consistent whitespace strategy, drop it.
Is it okay if a lot of numbers appear in my code?
Probably not.
In many (not all) cases, it’s best to name your numbers so each number appears only once in your code. That way, when the number changes there will only be one place in the code that has to change.
For example, suppose your program is working with shipping crates. The weight of an empty crate is 5.7. The expression 5.7 + contentsWeight probably means the weight of the crate including its contents, meaning the number 5.7 probably appears many times in the software. All these occurrences of the number 5.7 will be difficult to find and change when (not if) somebody changes the style of crates used in this application. The solution is to make sure the value 5.7 appears exactly once, usually as the initializer for a const identifier. Typically this will be something like const double crateWeight = 5.7;. After that, 5.7 + contentsWeight would be replaced by crateWeight + contentsWeight.
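As a small sketch of the before/after (the names follow the crate example above):
const double crateWeight = 5.7; // the one place the number appears

double shippingWeight(double contentsWeight)
{
    return crateWeight + contentsWeight; // instead of 5.7 + contentsWeight
}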
Now that’s the general rule of thumb. But unfortunately there is some fine print.
Some people believe one should never have numeric literals scattered in the code. They believe all numeric values should be named in a manner similar to that described above. That rule, however noble in intent, just doesn’t work very well in practice. It is too tedious for people to follow, and ultimately it costs companies more than it saves them. Remember: the goal of all programming rules is to reduce time, cost and risk. If a rule actually makes things worse, it is a bad rule, period.
A more practical rule is to focus on those values that are likely to change. For example, if a numeric literal is likely to change, it should appear only once in the software, usually as the initializer of a const identifier. This rule lets unchanging values, such as some occurrences of 0, 1, -1, etc., get coded directly in the software so programmers don’t have to search for the one true definition of one or zero. In other words, if a programmer wants to loop over the indices of a vector, he can simply write for (int i = 0; i < v.size(); ++i). The “extremist” rule described earlier would require the programmer to poke around asking if anybody else has defined a const identifier initialized to 0, and if not, to define his own const int zero = 0; then replace the loop with for (int i = zero; i < v.size(); ++i). This is all a waste of time since the loop will always start with 0. It adds cost without adding any value to compensate for that cost.
Obviously people might argue over exactly which values are “likely to change,” but that kind of judgment is why you get paid the big bucks: do your job and make a decision. Some people are so afraid of making a wrong decision that they’ll adopt a one-size-fits-all rule such as “give a name to every number.” But if you adopt rules like that, you’re guaranteed to have made the wrong decision: those rules cost your company more than they save. They are bad rules.
The choice is simple: use a flexible rule even though you might make a wrong decision, or use a one-size-fits-all rule and be guaranteed to make a wrong decision.
There is one more piece of fine print: where the const identifier should be defined. There are three typical cases:
- If the const identifier is used only within a single function, it can be local to that function.
- If the const identifier is used throughout a class and nowhere else, it can be static within the private part of that class.
- If the const identifier is used in numerous classes, it can be static within the public part of the most appropriate class, or perhaps private in that class with a public static access method.
As a last resort, make it static within a namespace or perhaps put it in the unnamed namespace. Try very hard to avoid using #define since the preprocessor is evil. If you need to use #define anyway, wash your hands when you’re done. And please ask some friends if they know of a better alternative.
(As used throughout the FAQ, “evil” doesn’t mean “never use it.” There are times when you will use something that is “evil” since it will be, in those particular cases, the lesser of two evils.)
What’s the point of the L, U and f suffixes on numeric literals?
You should use these suffixes when you need to force the compiler to treat the numeric literal as if it were the specified type. For example, if x is of type float, the expression x + 5.7 is of type double: it first promotes the value of x to a double, then performs the arithmetic using double-precision instructions. If that is what you want, fine; but if you really wanted it to do the arithmetic using single-precision instructions, you can change that code to x + 5.7f. Note: it is even better to “name” your numeric literals, particularly those that are likely to change. That would require you to say x + crateWeight where crateWeight is a const float that is initialized to 5.7f.
The U suffix is similar. It’s probably a good idea to use unsigned integers for variables that are always >= 0. For example, if a variable represents an index into an array, that variable would typically be declared as an unsigned. The main reason for this is it requires less code, at least if you are careful to check your ranges. For example, to check if a variable is both >= 0 and < max requires two tests if everything is signed: if (n >= 0 && n < max), but can be done with a single comparison if everything is unsigned: if (n < max).
If you end up using unsigned variables, it is generally a good idea to force your numeric literals to also be unsigned. That makes it easier to see that the compiler will generate “unsigned arithmetic” instructions. For example: if (n < 256U) or if ((n & 255u) < 32u). Mixing signed and unsigned values in a single arithmetic expression is often confusing for programmers: the compiler doesn’t always do what you expect it should do.
The L suffix is not as common, but it is occasionally used for similar reasons as above: to make it obvious that the compiler is using long arithmetic.
The bottom line is this: it is a good discipline for programmers to force all numeric operands to be of the right type, as opposed to relying on the C++ rules for promoting/demoting numeric expressions. For example, if x is of type int and y is of type unsigned, it is a good idea to change x + y so the next programmer knows whether you intended to use unsigned arithmetic, e.g., unsigned(x) + y, or signed arithmetic: x + int(y). The other possibility is long arithmetic: long(x) + long(y). By using those casts, the code is more explicit and that’s good in this case, since a lot of programmers don’t know all the rules for implicit promotions.
I can understand the and (&&) and or (||) operators, but what’s the purpose of the not (!) operator?
Some people are confused about the ! operator. For example, they think that !true is the same as false, or that !(a < b) is the same as a >= b, so in both cases the ! operator doesn’t seem to add anything.
Answer: The ! operator is useful in boolean expressions, such as those that occur in an if or while statement. For example, let’s assume A and B are boolean expressions, perhaps simple method-calls that return a bool. There are all sorts of ways to combine these two expressions:
if ( A && B) /*...*/ ;
if (!A && B) /*...*/ ;
if ( A && !B) /*...*/ ;
if (!A && !B) /*...*/ ;
if (!( A && B)) /*...*/ ;
if (!(!A && B)) /*...*/ ;
if (!( A && !B)) /*...*/ ;
if (!(!A && !B)) /*...*/ ;
Along with a similar group formed using the || operator.
Note: boolean algebra can be used to transform each of the &&-versions into an equivalent ||-version, so from a truth-table standpoint there are only 8 logically distinct if statements. However, since readability is so important in software, programmers should consider both the &&-version and the logically equivalent ||-version. For example, programmers should choose between !A && !B and !(A || B) based on which one is more obvious to whoever will be maintaining the code. In that sense there really are 16 different choices.
The point of all this is simple: the ! operator is quite useful in boolean expressions. Sometimes it is used for readability, and sometimes it is used because expressions like !(a < b) actually are not equivalent to a >= b, in spite of what your grade school math teacher told you.
Is !(a < b) logically the same as a >= b?
No!
Despite what your grade school math teacher taught you, these equivalences don’t always work in software, especially with floating point expressions or user-defined types.
Example: if a is a floating point NaN, then both a < b and a >= b will be false. That means !(a < b) will be true and a >= b will be false.
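You can watch this happen with a quiet NaN from <limits>; this is a minimal sketch:
#include <iostream>
#include <limits>
int main()
{
    double a = std::numeric_limits<double>::quiet_NaN();
    double b = 1.0;
    std::cout << (a < b) << '\n';  // 0: false
    std::cout << (a >= b) << '\n'; // 0: false
    std::cout << !(a < b) << '\n'; // 1: true, even though a >= b is false
}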
Example: if a is an object of class Foo that has overloaded operator< and operator>=, then it is up to the creator of class Foo whether these operators will have opposite semantics. They probably should have opposite semantics, but that’s up to whoever wrote class Foo.
What is this NaN thing?
NaN means “not a number,” and is used for floating point operations.
There are lots of floating point operations that don’t make sense, such as dividing by zero, taking the log of zero or a negative number, taking the square root of a negative number, etc. Depending on your compiler, some of these operations may produce special floating point values such as infinity (with distinct values for positive vs. negative infinity) and the not a number value, NaN.
If your compiler produces a NaN, it has the unusual property that it is not equal to any value, including itself. For example, if a is NaN, then a == a is false. In fact, if a is NaN, then a will be neither less than, equal to, nor greater than any value, including itself. In other words, regardless of the value of b, a < b, a <= b, a > b, a >= b, and a == b will all return false.
Here’s how to check if a value is NaN:
#include <cmath>
void funct(double x)
{
if (isnan(x)) { // Though see caveat below
// x is NaN
// ...
} else {
// x is a normal value
// ...
}
}
Note: although isnan() is part of the latest C standard library (and std::isnan() is declared in <cmath> since C++11), your C++ compiler vendor might not supply it. For example, Microsoft Visual C++.NET does not supply isnan() (though it does supply _isnan(), defined in <float.h>). If your vendor does not supply any variant of isnan(), define this function:
inline bool my_isnan(double x)
{
return x != x;
}
In any case, DO NOT WRITE ME just to say that your compiler does/does not support isnan().
Why is floating point so inaccurate? Why doesn’t this print 0.43?
#include <iostream>
int main()
{
float a = 1000.43;
float b = 1000.0;
std::cout << a - b << '\n';
// ...
}
(On one C++ implementation, this prints 0.429993)
Disclaimer: Frustration with rounding/truncation/approximation isn’t really a C++ issue; it’s a computer science issue. However, people keep asking about it on comp.lang.c++, so what follows is a nominal answer.
Answer: Floating point is an approximation. The IEEE standard for 32 bit float supports 1 bit of sign, 8 bits of exponent, and 23 bits of mantissa. Since a normalized binary-point mantissa always has the form 1.xxxxx… the leading 1 is dropped and you get effectively 24 bits of mantissa. The number 1000.43 (and many, many others, including some really common ones like 0.1) is not exactly representable in float or double format. 1000.43 is actually represented as the following bit pattern (the “s” shows the position of the sign bit, the “e”s show the positions of the exponent bits, and the “m”s show the positions of the mantissa bits):
seeeeeeeemmmmmmmmmmmmmmmmmmmmmmm
01000100011110100001101110000101
The shifted mantissa is 1111101000.01101110000101 or 1000 + 7045/16384. The fractional part is 0.429992675781. With 24 bits of mantissa you only get about 1 part in 16M of precision for float. The double type provides more precision (53 bits of mantissa).
Why doesn’t my floating-point comparison work?
Because floating point arithmetic is different from real number arithmetic.
Bottom line: Never use == to compare two floating point numbers.
Here’s a simple example:
double x = 1.0 / 10.0;
double y = x * 10.0;
if (y != 1.0)
std::cout << "surprise: " << y << " != 1\n";
The above “surprise” message will appear on some (but not all) compilers/machines. But even if your particular compiler/machine doesn’t cause the above “surprise” message (and if you write me telling me whether it does, you’ll show you’ve missed the whole point of this FAQ), floating point will surprise you at some point. So read this FAQ and you’ll know what to do.
The reason floating point will surprise you is that float and double values are normally represented using a finite precision binary format. In other words, floating point numbers are not real numbers. For example, in your machine’s floating point format it might be impossible to exactly represent the number 0.1. By way of analogy, it’s impossible to exactly represent the number one third in decimal format (unless you use an infinite number of digits).
To dig a little deeper, let’s examine what the decimal number 0.625 means. This number has a 6 in the “tenths” place, a 2 in the “hundredths” place, and a 5 in the “thousandths” place. In other words, we have a digit for each power of 10. But in binary, we might, depending on the details of your machine’s floating point format, have a bit for each power of 2. So the fractional part might have a “halves” place, a “quarters” place, an “eighths” place, a “sixteenths” place, etc., and each of these places has a bit.
Let’s pretend your machine represents the fractional part of floating point numbers using the above scheme (it’s normally more complicated than that, but if you already know exactly how floating point numbers are stored, chances are you don’t need this FAQ to begin with, so look at this as a good starting point). On that pretend machine, the bits of the fractional part of 0.625 would be 101: 1 in the ½-place, 0 in the ¼-place, and 1 in the ⅛-place. In other words, 0.625 is ½ + ⅛.
But on this pretend machine, 0.1 cannot be represented exactly since it cannot be formed as a sum of a finite number of powers of 2. You can get close but you can’t represent it exactly. In particular you’d have a 0 in the ½-place, a 0 in the ¼-place, a 0 in the ⅛-place, and finally a 1 in the “sixteenths” place, leaving a remainder of 1/10 - 1/16 = 3/80. Figuring out the other bits is left as an exercise (hint: look for a repeating bit-pattern, analogous to trying to represent 1/3 or 1/7 in decimal format).
The message is that some floating point numbers cannot always be represented exactly, so comparisons don’t always do what you’d like them to do. In other words, if the computer actually multiplies 10.0 by 1.0/10.0, it might not exactly get 1.0 back.
That’s the problem. Now here’s the solution: be very careful when comparing floating point numbers for equality (or when doing other things with floating point numbers; e.g., finding the average of two floating point numbers seems simple, but to do it right requires an if/else with at least three cases).
Here’s the wrong way to do it:
void dubious(double x, double y)
{
// ...
if (x == y) // Dubious!
foo();
// ...
}
If what you really want is to make sure they’re “very close” to each other (e.g., if variable a contains the value 1.0 / 10.0 and you want to see if (10*a == 1)), you’ll probably want to do something fancier than the above:
void smarter(double x, double y)
{
// ...
if (isEqual(x, y)) // Smarter!
foo();
// ...
}
There are many ways to define the isEqual() function, including:
#include <cmath> /* for std::abs(double) */
inline bool isEqual(double x, double y)
{
const double epsilon = 1e-5; /* some small, application-dependent tolerance */
return std::abs(x - y) <= epsilon * std::abs(x);
// see Knuth section 4.2.2 pages 217-218
}
Note: the above solution is not completely symmetric, meaning it is possible for isEqual(x,y) != isEqual(y,x). From a practical standpoint, this does not usually occur when the magnitudes of x and y are significantly larger than epsilon, but your mileage may vary.
For other useful functions, check out the following (listed alphabetically):
- Isaacson, E. and Keller, H., Analysis of Numerical Methods, Dover.
- Kahan, W., http.cs.berkeley.edu/~wkahan/.
- Knuth, Donald E., The Art of Computer Programming, Volume II: Seminumerical Algorithms, Addison-Wesley, 1969.
- LAPACK: Linear Algebra Subroutine Library, www.siam.org
- NETLIB: the collected algorithms from ACM Transactions on Mathematical Software, which have all been refereed, plus a great many other algorithms that have withstood somewhat less formal scrutiny from peers, www.netlib.org
- Numerical Recipes, by Press et al. Although note some negative reviews, such as amath.colorado.edu/computing/Fortran/numrec.html
- Ralston and Rabinowitz, A First Course in Numerical Analysis: Second Edition, Dover.
- Stoer, J. and Bulirsch, R., Introduction to Numerical Analysis, Springer Verlag, in German.
Double-check your assumptions, including “obvious” things like how to compute averages, how to solve quadratic equations, etc., etc. Do not assume the formulas you learned in High School will work with floating point numbers!
For insights on the underlying ideas and issues of floating point computation, start with David Goldberg’s paper, What Every Computer Scientist Should Know About Floating-Point Arithmetic, or here in PDF format. You might also want to read this supplement by Doug Priest. The combined paper + supplement is also available. You might also want to go here for links to other floating-point topics.
Why is cos(x) != cos(y) even though x == y? (Or sine or tangent or log or just about any other floating point computation)
I know it’s hard to accept, but floating point arithmetic simply does not work like most people expect. Worse, some of the differences are dependent on the details of your particular computer’s floating point hardware and/or the optimization settings you use on your particular compiler. You might not like that, but it’s the way it is. The only way to “get it” is to set aside your assumptions about how things ought to behave and accept things as they actually do behave.
Let’s work a simple example. Turns out that on some installations, cos(x) != cos(y) even though x == y. That’s not a typo; read it again if you’re not shocked: the cosine of something can be unequal to the cosine of the same thing. (Or the sine, or the tangent, or the log, or just about any other floating point computation.)
#include <iostream>
#include <cmath>
void foo(double x, double y)
{
if (std::cos(x) != std::cos(y)) {
std::cout << "Huh?!?\n"; // You might end up here when x == y!!
}
}
int main()
{
foo(1.0, 1.0);
return 0;
}
On many (not all) computers, you will end up in the if block even when x == y. If that doesn’t shock you, you’re asleep; read it again. If you want, try it on your particular computer. Some of you will end up in the if block, some will not, and for some it will depend on the details of your particular compiler or options or hardware or the phase of the moon.
Why, you ask, can that happen? Good question; thanks for asking. Here’s the answer (with emphasis on the word “often”; the behavior depends on your hardware, compiler, etc.): floating point calculations and comparisons are often performed by special hardware that often contains special registers, and those registers often have more bits than a double. That means that intermediate floating point computations often have more bits than sizeof(double), and when a floating point value is written to RAM, it often gets truncated, often losing some bits of precision.
Said another way, intermediate calculations are often more precise (have more bits) than when those same values get stored into RAM. Think of it this way: storing a floating point result into RAM requires some bits to get discarded, so comparing a (truncated) value in RAM with an (untruncated) value within a floating-point register might not do what you expect. Suppose your code computes cos(x), then truncates that result and stores it into a temporary variable, say tmp. It might then compute cos(y), and (drum roll please) compare the untruncated result of cos(y) with tmp, that is, with the truncated result of cos(x). Expressed in an imaginary assembly language, the expression cos(x) != cos(y) might get compiled into this:
// Imaginary assembly language
fp_load x // load a floating-point register with the value of parameter x
call _cos // call cos(double), using the floating point register for param and result
fp_store tmp // truncate the floating-point result and store into temporary local var, tmp
fp_load y // load a floating-point register with the value of parameter y
call _cos // call cos(double), using the floating point register for param and result
fp_cmp tmp // compare the untruncated result (in the register) with the truncated value in tmp
// ...
Did you catch that? Your particular installation might store the result of one of the cos() calls out into RAM, truncating it in the process, then later compare that truncated value with the untruncated result of the second cos() call. Depending on lots of details, those two values might not be equal.
It gets worse; better sit down. Turns out that the behavior can depend on how many instructions are between the cos() calls and the != comparison. In other words, if you put cos(x) and cos(y) into locals, then later compare those variables, the result of the comparison can depend on exactly what, if anything, your code does between storing the results into locals and comparing the variables. Gulp.
void foo(double x, double y)
{
double cos_x = cos(x);
double cos_y = cos(y);
// the behavior might depend on what's in here
if (cos_x != cos_y) {
std::cout << "Huh?!?\n"; // You might end up here when x == y!!
}
}
Your mouth should be hanging open by now. If not, you either learned pretty quickly from the above or you are still asleep. Read it again. When x == y, you can still end up in the if block depending on, among other things, how much code is in the ... line. Wow.
Reason: if the compiler can prove that you’re not messing with any floating point registers in the ... line, it might not actually store cos(y) into cos_y, instead leaving it in the register and comparing the untruncated register with the truncated variable cos_x. In this case, you might end up in the if block. But if you call a function between the two lines, such as printing one or both variables, or if you do something else that messes with the floating point registers, the compiler will (might) need to store the result of cos(y) into variable cos_y, after which it will be comparing two truncated values. In that case you won’t end up in the if block.
If you didn’t hear anything else in this whole discussion, just remember this: floating point comparisons are tricky and subtle and fraught with danger. Be careful. The way floating point actually works is different from the way most programmers tend to think it ought to work. If you intend to use floating point, you need to learn how it actually works.
What is the type of an enumeration such as enum Color? Is it of type int?
An enumeration such as enum Color { red, white, blue }; is its own type. It is not of type int.
When you create an object of an enumeration type, e.g., Color x;, we say that the object x is of type Color. Object x isn’t of type “enumeration,” and it’s not of type int.
An expression of an enumeration type can be converted to a temporary int. An analogy may help here. An expression of type float can be converted to a temporary double, but that doesn’t mean float is a subtype of double. For example, after the declaration float y;, we say that y is of type float, and the expression y can be converted to a temporary double. When that happens, a brand new, temporary double is created by copying something out of y. In the same way, a Color object such as x can be converted to a temporary int, in which case a brand new, temporary int is created by copying something out of x. (Note: the only purpose of the float / double analogy in this paragraph is to help explain how expressions of an enumeration type can be converted to temporary ints; do not try to use that analogy to imply any other behavior!)
The above conversion is very different from a subtype relationship, such as the relationship between derived class Car and its base class Vehicle. For example, an object of class Car, such as Car z;, actually is an object of class Vehicle, therefore you can bind a Vehicle& to that object, e.g., Vehicle& v = z;. Unlike the previous paragraph, the object z is not copied to a temporary; reference v binds to z itself. So we say an object of class Car is a Vehicle, but an object of type Color simply can be copied/converted into a temporary int. Big difference.
Final note, especially for C programmers: the C++ compiler will not automatically convert an int expression to a temporary Color. Since that sort of conversion is unsafe, it requires a cast, e.g., Color x = Color(2);. But be sure your integer is a valid enumeration value. If you provide an illegal value, you might end up with something other than what you expect. The compiler doesn’t do the check for you; you must do it yourself.
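For example, here is a small sketch of doing that check yourself before casting (the helper names are made up for illustration):
enum Color { red, white, blue };

bool isValidColor(int n)
{
    return n >= red && n <= blue; // relies on red, white, blue being 0, 1, 2
}

Color toColor(int n)
{
    // The compiler won't check n for us, so we check it here and fall back to red otherwise.
    return isValidColor(n) ? Color(n) : red;
}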
If an enumeration type is distinct from any other type, what good is it? What can you do with it?
Let’s consider this enumeration type: enum Color { red, white, blue };
The best way to look at this (C programmers: hang on to your seats!!) is that the values of this type are red, white, and blue, as opposed to merely thinking of those names as constant int values. The C++ compiler provides an automatic conversion from Color to int, and the converted values will be, in this case, 0, 1, and 2 respectively. But you shouldn’t think of blue as a fancy name for 2. blue is of type Color and there is an automatic conversion from blue to 2, but the inverse conversion, from int to Color, is not provided automatically by the C++ compiler.
Here is an example that illustrates the conversion from Color to int:
enum Color { red, white, blue };
void f()
{
int n;
n = red; // change n to 0
n = white; // change n to 1
n = blue; // change n to 2
}
The following example also demonstrates the conversion from Color to int:
void f()
{
Color x = red;
Color y = white;
Color z = blue;
int n;
n = x; // change n to 0
n = y; // change n to 1
n = z; // change n to 2
}
However the inverse conversion, from int to Color, is not automatically provided by the C++ compiler:
void f()
{
Color x;
x = blue; // change x to blue
x = 2; // compile-time error: can't convert int to Color
}
The last line above shows that enumeration types are not ints in disguise. You can think of them as int types if you want to, but if you do, you must remember that the C++ compiler will not implicitly convert an int to a Color. If you really want that, you can use a cast:
void f()
{
Color x;
x = red; // change x to red
x = Color(1); // change x to white
x = Color(2); // change x to blue
x = 2; // compile-time error: can't convert int to Color
}
There are other ways that enumeration types are unlike int. For example, enumeration types don’t have a ++ operator:
void f()
{
int n = red; // change n to 0
Color x = red; // change x to red
n++; // change n to 1
x++; // compile-time error: can't ++ an enumeration (though see caveat below)
}
Caveat on the last line: it is legal to provide an overloaded operator that would make that line legal, such as defining operator++(Color& x).
What other “newbie” guides are there for me?
An excellent place to start is this site’s Get Started page.