This comes down to function call overhead and indirection. ofstream::write() is inherited from ostream, and in libstdc++ that function is not inlined, which is the first source of overhead. ostream::write() then has to call rdbuf()->sputn() to do the actual writing.
sputn() is itself a thin non-virtual wrapper, but it forwards to the virtual function xsputn(), so every call to write() also pays for a virtual dispatch into the file buffer.
If you put the characters into the buffer yourself, you can avoid that overhead.