Yes, I'm seeing clang++ slightly faster too, but only for the limited length fixed string case.
Same here.
I also fiddled with some of the many compiler parameters but felt overwhelmed by the sheer number and complexity of them, so just stuck to the basic ones for now.
I tried various parameters too, but saw no improvement. The following performs similarly, without the extra CFLAGS.
$ clang++ -o llil2vec -std=c++11 -Wall -O3 llil2vec.cpp
$ ./llil2vec big1.txt big2.txt big3.txt >out.txt
llil2vec (fixed string length=6) start
get_properties CPU time : 1.65488 secs
emplace set sort CPU time : 0.470765 secs
write stdout CPU time : 0.850356 secs
total CPU time : 2.97605 secs
total wall clock time : 3 secs
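
For anyone comparing compiler flags like this, here is a minimal sketch of how per-phase CPU time and wall clock time could be collected in C++11. It is not taken from llil2vec.cpp: the `Timing` struct and the `time_phase` helper are hypothetical names for illustration, and the choice of std::clock plus std::chrono::steady_clock is an assumption about one reasonable way to do it.

```cpp
#include <chrono>
#include <ctime>

// Hypothetical result type (not from llil2vec.cpp).
struct Timing {
    double cpu_secs;   // processor time consumed by the process
    double wall_secs;  // elapsed real time
};

// Hypothetical helper: runs `work` once and measures it both ways.
// std::clock reports CPU time used by the process; steady_clock
// reports elapsed time and never jumps backwards.
template <typename F>
Timing time_phase(F&& work) {
    const std::clock_t c0 = std::clock();
    const std::chrono::steady_clock::time_point w0 = std::chrono::steady_clock::now();

    work();

    const std::clock_t c1 = std::clock();
    const std::chrono::steady_clock::time_point w1 = std::chrono::steady_clock::now();

    Timing t;
    t.cpu_secs = static_cast<double>(c1 - c0) / CLOCKS_PER_SEC;
    t.wall_secs = std::chrono::duration<double>(w1 - w0).count();
    return t;
}
```

Wrapping each phase in a lambda, e.g. `Timing t = time_phase([&] { /* read input */ });`, gives one CPU figure and one wall clock figure per phase. In a single-threaded run the two should be close; a large gap usually points at I/O waits or other processes competing for the CPU rather than at the compiler flags.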