regex quotes character class

igoryonya has asked for the wisdom of the Perl Monks concerning the following question:

Re: regex quotes character class by soonix (Chancellor) on Jun 17, 2020 at 06:27 UTC
In perluniprops there are `\p{QMark}` and `\p{Quotation_Mark}`, and I assume they span the characters listed in https://en.wikipedia.org/wiki/Quotation_mark#Unicode_code_point_table
Re^2: regex quotes character class by ikegami (Patriarch) on Jun 17, 2020 at 14:26 UTC
`\p{Quotation_Mark=Yes}` and its alias `\p{QMark=Yes}`, which can be simplified to `\p{Quotation_Mark}` and `\p{QMark}`, do indeed track the similarly-named Unicode character property. Characters with this property include most of the characters listed in the linked table and some of the characters listed by the OP.
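To make the property concrete, here is a minimal sketch of using it in a regex. The helper names `count_quotation_marks` and `flatten_quotes` are made up for illustration and do not come from the thread; the sample string uses `\x{...}` escapes so the script does not depend on the source file's encoding.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count the characters in $text
# that carry the Unicode property Quotation_Mark=Yes.
sub count_quotation_marks {
    my ($text) = @_;
    my $count = () = $text =~ /\p{Quotation_Mark}/g;
    return $count;
}

# Hypothetical helper: return a copy of $text in which every character
# matched by \p{QMark} (the short alias) is replaced by an ASCII double quote.
# Note that this also turns an ASCII apostrophe into a double quote.
sub flatten_quotes {
    my ($text) = @_;
    (my $copy = $text) =~ s/\p{QMark}/"/g;
    return $copy;
}

# U+201C and U+201D are the curly double quotes; the backticks are not
# expected to match, since GRAVE ACCENT lacks the property.
my $sample = "\x{201C}curly\x{201D} and 'straight' and `backticks`";
printf "%d quotation marks found\n", count_quotation_marks($sample);
```

Whether an apostrophe should count as a quotation mark depends on the text being processed, so treat the substitution helper as a starting point rather than a finished normaliser.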
Re: regex quotes character class by kcott (Archbishop) on Jun 17, 2020 at 11:11 UTC
G'day igoryonya,

"I think, I used to stumble upon a character class, or a property ... can't find any information about it now ... something like ... `[[:quote:]]` or `\p{Quote}` ..."

I suspect you were looking in one or both of perlrecharclass or perlrebackslash; however, neither `[[:quote:]]` nor `\p{Quote}` exists (as far as I can tell). I see ++soonix has indicated `\p{QMark}` and `\p{Quotation_Mark}` in perluniprops.

On a number of occasions in the past, I have had to identify Unicode properties (which could be for spaces, combining characters, specific scripts, and so on). I've tried the table in the "perluniprops: Properties accessible through \p{} and \P{}" section but find it to be a hard slog; for instance, a case-insensitive search for "quote" does find some matches but not the `\p{Quotation_Mark}` or `\p{QMark}` that you actually wanted.

I have found the best tool to be the core module Unicode::UCD. This has many functions which can help you identify properties (as well as find a lot of other information). I've put together a script to showcase some of that module's functionality: I'd recommend at least skimming the documentation to get a feel for the other functions that are available. Much of the code in the script I'd probably just do from the command line; however, it seemed easier to lump it all together for the purposes of the current post.

```perl
#!/usr/bin/env perl

use 5.030;
use warnings;
use utf8;
use open OUT => qw{:encoding(UTF-8) :std};

use Data::Dump;
use Unicode::UCD qw{ charprops_all charprop prop_aliases };

# Dump every property of one known quote character.
say 'Find properties of interest for a quote:';
dd charprops_all(ord '"');
say '-' x 40;

# List all names by which the property can be referenced.
say 'Find aliases for "Quotation_Mark":';
my @aliases = prop_aliases('Quotation_Mark');
say join "\n", @aliases;
say '-' x 40;

# The list contains a comma, which qw{} would otherwise warn about.
no warnings 'qw';
my @chars = qw{" ' ` ~ X ‘ ’ ‚ ‛ , “ ” ‟ „ « » < >};
use warnings 'qw';

# Compare what the UCD reports with what the regex engine matches.
say 'Check UCD vs. regex properties:';
for my $prop (@aliases) {
    say '=' x 40;
    say "Property: $prop";
    say '=' x 40;

    for my $char (@chars) {
        my $cp = sprintf 'U+%04x', ord $char;
        say "Character: $char";
        say "Code point: $cp";
        my $qmark_prop = charprop($cp, $prop);
        say "$prop: $qmark_prop";
        my $re_prop = $char =~ /^\p{$prop}$/ ? 'Yes' : 'No';
        say "Check regex: $re_prop";
        say 'UCD & RE match: ', $qmark_prop eq $re_prop ? 'Yes' : '!!! No !!!';
        say '-' x 40;
    }
}
```

[Aside: Yes, `<code>` tags are generally preferred but, when the code contains Unicode characters, `<pre>` tags stop those characters from being turned into entity references (e.g. `‛` in your OP). I've also bunched up the code somewhat as `<pre>` tags won't wrap the code around like you'd get with `<code>` tags.]

Notes:

Note the "`use 5.030;`" at the start. Versions of Perl typically seem to keep up with Unicode versions (some are just one version behind). Perl v5.30 supports the current Unicode v12.1 (see "perl5300delta: Unicode 12.1 is supported"). Unicode has a "BETA Unicode® 13.0.0" and the development Perl v5.31.9 supports that (see "perldelta (5.31.9): Unicode 13.0 (beta) is supported").

The list of characters in `@chars` is somewhat arbitrary. It includes a number of non-quotes for testing; there's also a comma which, at least to me, appears identical to '`‚`' (`U+201A SINGLE LOW-9 QUOTATION MARK`).

Here's an extract of the output. The full output, which is rather long, is in the spoiler below.

```
Find properties of interest for a quote:
{
  ...
  Quotation_Mark => "Yes",
  ...
}
----------------------------------------
Find aliases for "Quotation_Mark":
QMark
Quotation_Mark
----------------------------------------
Check UCD vs. regex properties:
========================================
Property: QMark
========================================
Character: "
Code point: U+0022
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
...
----------------------------------------
Character: X
Code point: U+0058
QMark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: ‘
Code point: U+2018
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
...
========================================
Property: Quotation_Mark
========================================
... same as QMark ...
```

Full output:

```
Find properties of interest for a quote:
{
  Age => "V1_1",
  Alphabetic => "No",
  ASCII_Hex_Digit => "No",
  Bidi_Class => "Other_Neutral",
  Bidi_Control => "No",
  Bidi_Mirrored => "No",
  Bidi_Mirroring_Glyph => "",
  Bidi_Paired_Bracket => "",
  Bidi_Paired_Bracket_Type => "None",
  Block => "Basic_Latin",
  Canonical_Combining_Class => "Not_Reordered",
  Case_Folding => "\"",
  Case_Ignorable => "No",
  Cased => "No",
  Changes_When_Casefolded => "No",
  Changes_When_Casemapped => "No",
  Changes_When_Lowercased => "No",
  Changes_When_NFKC_Casefolded => "No",
  Changes_When_Titlecased => "No",
  Changes_When_Uppercased => "No",
  Composition_Exclusion => "No",
  Dash => "No",
  Decomposition_Mapping => "\"",
  Decomposition_Type => "None",
  Default_Ignorable_Code_Point => "No",
  Deprecated => "No",
  Diacritic => "No",
  East_Asian_Width => "Narrow",
  Equivalent_Unified_Ideograph => "",
  Extender => "No",
  Full_Composition_Exclusion => "No",
  General_Category => "Other_Punctuation",
  Grapheme_Base => "Yes",
  Grapheme_Cluster_Break => "Other",
  Grapheme_Extend => "No",
  Hangul_Syllable_Type => "Not_Applicable",
  Hex_Digit => "No",
  Hyphen => "No",
  ID_Continue => "No",
  ID_Start => "No",
  Ideographic => "No",
  IDS_Binary_Operator => "No",
  IDS_Trinary_Operator => "No",
  Indic_Positional_Category => "NA",
  Indic_Syllabic_Category => "Other",
  ISO_Comment => "",
  Join_Control => "No",
  Joining_Group => "No_Joining_Group",
  Joining_Type => "Non_Joining",
  Line_Break => "Quotation",
  Logical_Order_Exception => "No",
  Lowercase => "No",
  Lowercase_Mapping => "\"",
  Math => "No",
  Name => "QUOTATION MARK",
  Name_Alias => "",
  NFC_Quick_Check => "Yes",
  NFD_Quick_Check => "Yes",
  NFKC_Casefold => "\"",
  NFKC_Quick_Check => "Yes",
  NFKD_Quick_Check => "Yes",
  Noncharacter_Code_Point => "No",
  Numeric_Type => "None",
  Numeric_Value => NaN,
  Pattern_Syntax => "Yes",
  Pattern_White_Space => "No",
  Prepended_Concatenation_Mark => "No",
  Present_In => 1.1,
  Quotation_Mark => "Yes",
  Radical => "No",
  Regional_Indicator => "No",
  Script => "Common",
  Script_Extensions => "Common",
  Sentence_Break => "Close",
  Sentence_Terminal => "No",
  Simple_Case_Folding => "\"",
  Simple_Lowercase_Mapping => "\"",
  Simple_Titlecase_Mapping => "\"",
  Simple_Uppercase_Mapping => "\"",
  Soft_Dotted => "No",
  Terminal_Punctuation => "No",
  Titlecase_Mapping => "\"",
  Unicode_1_Name => "",
  Unified_Ideograph => "No",
  Uppercase => "No",
  Uppercase_Mapping => "\"",
  Variation_Selector => "No",
  Vertical_Orientation => "Rotated",
  White_Space => "No",
  Word_Break => "Double_Quote",
  XID_Continue => "No",
  XID_Start => "No",
}
----------------------------------------
Find aliases for "Quotation_Mark":
QMark
Quotation_Mark
----------------------------------------
Check UCD vs. regex properties:
========================================
Property: QMark
========================================
Character: "
Code point: U+0022
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: '
Code point: U+0027
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: `
Code point: U+0060
QMark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: ~
Code point: U+007e
QMark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: X
Code point: U+0058
QMark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: ‘
Code point: U+2018
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ’
Code point: U+2019
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ‚
Code point: U+201a
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ‛
Code point: U+201b
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ,
Code point: U+002c
QMark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: “
Code point: U+201c
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ”
Code point: U+201d
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ‟
Code point: U+201f
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: „
Code point: U+201e
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: «
Code point: U+00ab
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: »
Code point: U+00bb
QMark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: <
Code point: U+003c
QMark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: >
Code point: U+003e
QMark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
========================================
Property: Quotation_Mark
========================================
Character: "
Code point: U+0022
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: '
Code point: U+0027
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: `
Code point: U+0060
Quotation_Mark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: ~
Code point: U+007e
Quotation_Mark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: X
Code point: U+0058
Quotation_Mark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: ‘
Code point: U+2018
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ’
Code point: U+2019
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ‚
Code point: U+201a
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ‛
Code point: U+201b
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ,
Code point: U+002c
Quotation_Mark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: “
Code point: U+201c
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ”
Code point: U+201d
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: ‟
Code point: U+201f
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: „
Code point: U+201e
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: «
Code point: U+00ab
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: »
Code point: U+00bb
Quotation_Mark: Yes
Check regex: Yes
UCD & RE match: Yes
----------------------------------------
Character: <
Code point: U+003c
Quotation_Mark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
Character: >
Code point: U+003e
Quotation_Mark: No
Check regex: No
UCD & RE match: Yes
----------------------------------------
```

-- Ken
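Unicode::UCD can also go the other way and list every code point that carries a given property, which may save some of the property hunting described above. The following is a minimal sketch, assuming Perl 5.16 or later (for `prop_invlist`); the helper name `code_points_with` is invented for illustration and is not part of the module.

```perl
use strict;
use warnings;
use Unicode::UCD qw(prop_invlist);
use charnames ();

# Hypothetical helper (not part of Unicode::UCD): expand the inversion list
# for a binary property into a flat list of code points.
sub code_points_with {
    my ($property) = @_;
    my @invlist = prop_invlist($property);
    my @code_points;
    for (my $i = 0; $i < @invlist; $i += 2) {
        # Each pair is (first code point in the range,
        #               first code point after the range).
        my $last = $i + 1 < @invlist
            ? $invlist[$i + 1] - 1
            : $Unicode::UCD::MAX_CP;    # open-ended final range
        push @code_points, $invlist[$i] .. $last;
    }
    return @code_points;
}

# Print each code point with its name; the names are plain ASCII, so no
# output encoding layer is needed here.
for my $cp (code_points_with('Quotation_Mark')) {
    printf "U+%04X %s\n", $cp, charnames::viacode($cp) // '(unnamed)';
}
```

The exact list printed depends on the Unicode version bundled with the Perl that runs it.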
Re: regex quotes character class by ikegami (Patriarch) on Jun 17, 2020 at 11:46 UTC
You could use the following:

```
[\p{QMark}`\N{U+275B}-\N{U+275E}\N{U+1F676}-\N{U+1F678}\N{U+2826}\N{U+2834}]
```

What follows explains how this was derived.

Let's start by collecting some info about each character.

```
for q in \" \' « » ‘ ’ ‛ “ ” ❝ ❞ 🙶 🙷 \`; do
   uniprops --all --single -- "$q" >"props-$q"
done
```

The programs `uniprops` and `unichars` (used later) are provided by Unicode::Tussle.

Let's collect what we have.

```
perl -e'
   use 5.014;
   use warnings;
   my %props;
   while (<>) {
      chomp;
      ++$props{$_};
   }
   say "$props{$_} $_"
      for
         sort { $props{$b} <=> $props{$a} || $a cmp $b }
            keys(%props);
' props-*
```

The list is long, but a lot are redundant (aliases and short forms). [It would be nice if we could tell it to output just one form of equivalent forms!]

```
14 All
14 Any
14 Assigned
14 BC=ON
14 Bidi_Class=ON
14 Bidi_Class=Other_Neutral
14 Bidi_Paired_Bracket_Type=None
14 CCC=NR
14 Canonical_Combining_Class=0
14 Canonical_Combining_Class=NR
14 Canonical_Combining_Class=Not_Reordered
14 Common
14 DT=None
14 Decomposition_Type=None
14 GCB=XX
14 GrBase
14 Gr_Base
14 Graph
14 Grapheme_Base
14 Grapheme_Cluster_Break=Other
14 Grapheme_Cluster_Break=XX
14 HST=NA
14 Hangul_Syllable_Type=NA
14 Hangul_Syllable_Type=Not_Applicable
14 IN=10.0
14 IN=11.0
14 IN=12.0
14 IN=12.1
14 IN=7.0
14 IN=8.0
14 IN=9.0
14 InPC=NA
14 InSC=Other
14 Indic_Positional_Category=NA
14 Indic_Syllabic_Category=Other
14 JG=NoJoiningGroup
14 JT=U
14 Joining_Group=No_Joining_Group
14 Joining_Type=Non_Joining
14 Joining_Type=U
14 NT=None
14 NV=NaN
14 Numeric_Type=None
14 Numeric_Value=NaN
14 Present_In=10.0
14 Present_In=11.0
14 Present_In=12.0
14 Present_In=12.1
14 Present_In=7.0
14 Present_In=8.0
14 Present_In=9.0
14 Present_In=V10_0
14 Present_In=V11_0
14 Present_In=V12_0
14 Present_In=V12_1
14 Present_In=V7_0
14 Present_In=V8_0
14 Present_In=V9_0
14 Print
14 SC=Zyyy
14 Script=Common
14 Script=Zyyy
14 Script_Extensions=Common
14 Script_Extensions=Zyyy
14 Scx=Zyyy
14 Unicode
14 X_POSIX_Graph
14 X_POSIX_Print
14 Zyyy
13 LB=QU
13 Line_Break=QU
13 Line_Break=Quotation
13 SB=CL
13 Sentence_Break=CL
13 Sentence_Break=Close
12 Age=1.1
12 Age=V1_1
12 IN=1.1
12 IN=2.0
12 IN=2.1
12 IN=3.0
12 IN=3.1
12 IN=3.2
12 IN=4.0
12 IN=4.1
12 IN=5.0
12 IN=5.1
12 IN=5.2
12 IN=6.0
12 IN=6.1
12 IN=6.2
12 IN=6.3
12 PatSyn
12 Pat_Syn
12 Pattern_Syntax
12 Present_In=1.1
12 Present_In=2.0
12 Present_In=2.1
12 Present_In=3.0
12 Present_In=3.1
12 Present_In=3.2
12 Present_In=4.0
12 Present_In=4.1
12 Present_In=5.0
12 Present_In=5.1
12 Present_In=5.2
12 Present_In=6.0
12 Present_In=6.1
12 Present_In=6.2
12 Present_In=6.3
12 Present_In=V2_0
12 Present_In=V2_1
12 Present_In=V3_0
12 Present_In=V3_1
12 Present_In=V3_2
12 Present_In=V4_0
12 Present_In=V4_1
12 Present_In=V5_0
12 Present_In=V5_1
12 Present_In=V5_2
12 Present_In=V6_0
12 Present_In=V6_1
12 Present_In=V6_2
12 Present_In=V6_3
10 Vertical_Orientation=R
10 Vertical_Orientation=Rotated
10 Vo=R
10 WB=XX
10 Word_Break=Other
10 Word_Break=XX
10 X_POSIX_Punct
9 Is_Punctuation
9 P
9 Punct
9 Punctuation
9 QMark
9 Quotation_Mark
9 \pP
7 East_Asian_Width=Neutral
5 BLK=Punctuation
5 Block=General_Punctuation
5 Block=Punctuation
5 General_Punctuation
5 InPunctuation
5 S
5 Symbol
5 \pS
4 CI
4 Case_Ignorable
4 EA=A
4 East_Asian_Width=A
4 East_Asian_Width=Ambiguous
4 Initial_Punctuation
4 Other_Symbol
4 Pi
4 So
4 Vertical_Orientation=U
4 Vertical_Orientation=Upright
4 Vo=U
4 \p{Pi}
4 \p{So}
3 ASCII
3 BLK=ASCII
3 Basic_Latin
3 Block=ASCII
3 Block=Basic_Latin
3 EA=Na
3 East_Asian_Width=Na
3 East_Asian_Width=Narrow
3 Final_Punctuation
3 POSIX_Graph
3 POSIX_Print
3 POSIX_Punct
3 Pf
3 \p{Pf}
2 Age=7.0
2 Age=V7_0
2 BLK=Latin1
2 BidiM
2 Bidi_M
2 Bidi_Mirrored
2 Block=Dingbats
2 Block=Latin_1
2 Block=Latin_1_Sup
2 Block=Latin_1_Supplement
2 Block=Ornamental_Dingbats
2 Dingbats
2 InLatin1
2 Latin_1
2 Latin_1_Sup
2 Latin_1_Supplement
2 Ornamental_Dingbats
2 Other_Punctuation
2 Po
2 WB=MB
2 Word_Break=MB
2 Word_Break=MidNumLet
2 \p{Po}
1 Dia
1 Diacritic
1 LB=AL
1 Line_Break=AL
1 Line_Break=Alphabetic
1 Modifier_Symbol
1 SB=XX
1 Sentence_Break=Other
1 Sentence_Break=XX
1 Sk
1 U+0022 ‹"› \N{QUOTATION MARK}
1 U+0027 ‹'› \N{APOSTROPHE}
1 U+0060 ‹`› \N{GRAVE ACCENT}
1 U+00AB ‹«› \N{LEFT-POINTING DOUBLE ANGLE QUOTATION MARK}
1 U+00BB ‹»› \N{RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK}
1 U+1F676 ‹🙶› \N{SANS-SERIF HEAVY DOUBLE TURNED COMMA QUOTATION MARK ORNAMENT}
1 U+1F677 ‹🙷› \N{SANS-SERIF HEAVY DOUBLE COMMA QUOTATION MARK ORNAMENT}
1 U+2018 ‹‘› \N{LEFT SINGLE QUOTATION MARK}
1 U+2019 ‹’› \N{RIGHT SINGLE QUOTATION MARK}
1 U+201B ‹‛› \N{SINGLE HIGH-REVERSED-9 QUOTATION MARK}
1 U+201C ‹“› \N{LEFT DOUBLE QUOTATION MARK}
1 U+201D ‹”› \N{RIGHT DOUBLE QUOTATION MARK}
1 U+275D ‹❝› \N{HEAVY DOUBLE TURNED COMMA QUOTATION MARK ORNAMENT}
1 U+275E ‹❞› \N{HEAVY DOUBLE COMMA QUOTATION MARK ORNAMENT}
1 WB=DQ
1 WB=SQ
1 Word_Break=DQ
1 Word_Break=Double_Quote
1 Word_Break=SQ
1 Word_Break=Single_Quote
1 \p{Sk}
```

We consult perluniprops and see that no interesting property matches all 14. This is not surprising, since we have characters from two General Categories.

```
9 Punctuation
5 Symbol
```

The 9 punctuation characters all match `\p{Quotation_Mark}` aka `\p{QMark}`!
This is the full set of quotation marks:

```
$ unichars -au '\p{QMark}' | cat
 "  U+00022 QUOTATION MARK
 '  U+00027 APOSTROPHE
 «  U+000AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
 »  U+000BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
 ‘  U+02018 LEFT SINGLE QUOTATION MARK
 ’  U+02019 RIGHT SINGLE QUOTATION MARK
 ‚  U+0201A SINGLE LOW-9 QUOTATION MARK
 ‛  U+0201B SINGLE HIGH-REVERSED-9 QUOTATION MARK
 “  U+0201C LEFT DOUBLE QUOTATION MARK
 ”  U+0201D RIGHT DOUBLE QUOTATION MARK
 „  U+0201E DOUBLE LOW-9 QUOTATION MARK
 ‟  U+0201F DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 ‹  U+02039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
 ›  U+0203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
 ⹂  U+02E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK
 「  U+0300C LEFT CORNER BRACKET
 」  U+0300D RIGHT CORNER BRACKET
 『  U+0300E LEFT WHITE CORNER BRACKET
 』  U+0300F RIGHT WHITE CORNER BRACKET
 〝  U+0301D REVERSED DOUBLE PRIME QUOTATION MARK
 〞  U+0301E DOUBLE PRIME QUOTATION MARK
 〟  U+0301F LOW DOUBLE PRIME QUOTATION MARK
 ﹁  U+0FE41 PRESENTATION FORM FOR VERTICAL LEFT CORNER BRACKET
 ﹂  U+0FE42 PRESENTATION FORM FOR VERTICAL RIGHT CORNER BRACKET
 ﹃  U+0FE43 PRESENTATION FORM FOR VERTICAL LEFT WHITE CORNER BRACKET
 ﹄  U+0FE44 PRESENTATION FORM FOR VERTICAL RIGHT WHITE CORNER BRACKET
 ＂  U+0FF02 FULLWIDTH QUOTATION MARK
 ＇  U+0FF07 FULLWIDTH APOSTROPHE
 ｢  U+0FF62 HALFWIDTH LEFT CORNER BRACKET
 ｣  U+0FF63 HALFWIDTH RIGHT CORNER BRACKET
```

The 5 symbol characters aren't in any useful category.
The symbols are:

```
$ grep -L ^QMark$ props-* \
| perl -CS -ne'
   use 5.014;
   use warnings;
   use charnames qw( );
   s/^props-//;
   $_ = ord($_);
   printf "U+%05X %s\n", $_, charnames::viacode($_);
' \
| sort
U+00060 GRAVE ACCENT
U+0275D HEAVY DOUBLE TURNED COMMA QUOTATION MARK ORNAMENT
U+0275E HEAVY DOUBLE COMMA QUOTATION MARK ORNAMENT
U+1F676 SANS-SERIF HEAVY DOUBLE TURNED COMMA QUOTATION MARK ORNAMENT
U+1F677 SANS-SERIF HEAVY DOUBLE COMMA QUOTATION MARK ORNAMENT
```

So you could use

```
[\p{QMark}`\N{U+275D}\N{U+275E}\N{U+1F676}\N{U+1F677}]
```

Four of those you listed have "QUOTATION MARK" in their name, so why aren't they matched by `\p{QMark}`? Well, they're actually "QUOTATION MARK ORNAMENT". U+275D, U+275E, U+1F676 and U+1F677 are all dingbats (emoji from before "emoji" was a word, kinda). They're not meant for use in text.

There are three more of these:

```
U+0275B HEAVY SINGLE TURNED COMMA QUOTATION MARK ORNAMENT
U+0275C HEAVY SINGLE COMMA QUOTATION MARK ORNAMENT
U+1F678 SANS-SERIF HEAVY LOW DOUBLE COMMA QUOTATION MARK ORNAMENT
```

Finally, this table also points out two braille characters you might want to include.

```
U+02826 BRAILLE PATTERN DOTS-236
U+02834 BRAILLE PATTERN DOTS-356
```

Update: Added the first section (the summary/"tl;dr") and the last section about why the 5 aren't quotation marks. Added the suggestion for additions to the list. Small wording tweaks.
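The character class from the summary can be precompiled and spot-checked before it goes into real code. This is only a sketch: `$quote_like` and `is_quote_like` are illustrative names that do not appear in the thread, and the last escape is written as `\N{U+2834}` on the assumption that this is what the summary's wrapped line intended (it matches the second braille pattern listed above).

```perl
use strict;
use warnings;

# The summary's class: everything with Quotation_Mark=Yes, the backtick,
# the quotation mark ornaments, and the two braille patterns.
my $quote_like = qr/[\p{QMark}`\N{U+275B}-\N{U+275E}\N{U+1F676}-\N{U+1F678}\N{U+2826}\N{U+2834}]/;

# Hypothetical helper: true if the string is exactly one quote-like character.
sub is_quote_like {
    my ($char) = @_;
    return $char =~ /\A$quote_like\z/ ? 1 : 0;
}

# Spot-check a few code points; output is ASCII only.
for my $cp (0x22, 0x60, 0x7E, 0x275D, 0x1F678, 0x2834) {
    printf "U+%04X %s\n", $cp, is_quote_like(chr $cp) ? 'matches' : 'no match';
}
```

Keeping the class in a single `qr//` object means a later decision to drop the ornaments or the braille patterns only has to be made in one place.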