68 commits

Author SHA1 Message Date
Ting-Yu Lin
653f4b3694 Bug 1745113 Part 5 - Make grapheme cluster break iterators implement SegmentIteratorUtf16, and adapt the callers. r=necko-reviewers,jfkthame,kershaw
This is the main patch for the bug. It changes the grapheme cluster break
iterators' `Next()` API by implementing the SegmentIteratorUtf16 interface, and
adapts the callers. It shouldn't change the behavior.

While rewriting the callers, one caveat worth mentioning is the loop termination
condition. If the old code relied on `!AtEnd()` as the loop termination
condition and advanced the iterator at the end of the loop body, it meant
to *skip* its logic when the break position is at the end of the string. For
an example, see `mozTXTToHTMLConv::NumberOfMatches`.
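
The caveat above can be illustrated with a minimal sketch. Everything in it is
hypothetical and is not the Gecko code: `OldStyleIterator`, `NewStyleIterator`,
`CountMatchesOld` and `CountMatchesNew` are made-up names, `std::optional`
stands in for whatever optional type the real interface returns, and every
UTF-16 code unit is treated as its own cluster to keep the example small.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical old-style iterator: the caller polls AtEnd() and calls Next().
// Toy assumption: each UTF-16 code unit is one "cluster".
class OldStyleIterator {
 public:
  explicit OldStyleIterator(const std::u16string& aText) : mText(aText) {}
  bool AtEnd() const { return mPos >= mText.size(); }
  void Next() {
    if (!AtEnd()) ++mPos;
  }
  std::size_t Position() const { return mPos; }

 private:
  const std::u16string& mText;
  std::size_t mPos = 0;
};

// Hypothetical new-style iterator: Next() returns the next break position,
// including the final one at the end of the string, then nothing.
class NewStyleIterator {
 public:
  explicit NewStyleIterator(const std::u16string& aText) : mText(aText) {}
  std::optional<std::size_t> Next() {
    if (mPos >= mText.size()) return std::nullopt;
    return ++mPos;
  }

 private:
  const std::u16string& mText;
  std::size_t mPos = 0;
};

// Old shape: `!AtEnd()` guards the body, so the body never runs at the end.
std::size_t CountMatchesOld(const std::u16string& aText, char16_t aTarget) {
  std::size_t count = 0;
  OldStyleIterator iter(aText);
  while (!iter.AtEnd()) {
    if (aText[iter.Position()] == aTarget) ++count;
    iter.Next();
  }
  return count;
}

// New shape: Next() also yields the break at aText.size(), so the loop must
// check the position explicitly to keep skipping the body there.
std::size_t CountMatchesNew(const std::u16string& aText, char16_t aTarget) {
  std::size_t count = 0;
  NewStyleIterator iter(aText);
  std::size_t pos = 0;  // The first cluster starts at offset 0.
  while (pos < aText.size()) {
    if (aText[pos] == aTarget) ++count;
    std::optional<std::size_t> next = iter.Next();
    if (!next) break;
    pos = *next;
  }
  return count;
}
```

In this sketch both functions visit the same offsets. Dropping the
`pos < aText.size()` check in the second one would run the body once more at
the end-of-string break, which is the behavior change the caveat warns about.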

This patch also hooks grapheme cluster break iterator into
Segmenter::TryCreate() interface.

Existing test coverage for the files changed:
- netwerk/test/unit/test_mozTXTToHTMLConv.js
- layout/reftests/forms/input/file/dynamic-max-width.html

Differential Revision: https://phabricator.services.mozilla.com/D135643
2022-01-13 18:36:04 +00:00
Ting-Yu Lin
e418a257b5 Bug 1745113 Part 3 - Change CountGraphemeClusters() to take a Span parameter. r=jfkthame
Differential Revision: https://phabricator.services.mozilla.com/D135641
2022-01-13 18:36:04 +00:00
Ting-Yu Lin
948b11a2dc Bug 1745113 Part 2 - Move ClusterReverseIterator into Segmenter.h, and rename it. r=necko-reviewers,kershaw
Include "nsLayoutUtils.h" in nsFileControlFrame to get rid of warnings in my
editor because it uses utilities such as `nsLayoutUtils::AppUnitWidthOfString`.
We compile it without issues because of unified build.

Differential Revision: https://phabricator.services.mozilla.com/D135640
2022-01-13 18:36:03 +00:00
Ting-Yu Lin
e522533f4e Bug 1745113 Part 1 - Move ClusterIterator into Segmenter.h, and rename it. r=necko-reviewers,kershaw
This patch doesn't change the behavior. Just move the code around.

Differential Revision: https://phabricator.services.mozilla.com/D135639
2022-01-13 18:36:03 +00:00
Dan Minor
e12c3387e8 Bug 1719554 - Unify most of nsUnicodeProperties.h; r=platform-i18n-reviewers,jfkthame,gregtatum,necko-reviewers,valentin
This unifies most of the calls in nsUnicodeProperties.h. CharType and Script
will be handled in subsequent patches on this bug.

Differential Revision: https://phabricator.services.mozilla.com/D132273
2021-12-06 18:15:49 +00:00
Butkovits Atila
56c46d06a1 Backed out 3 changesets (bug 1719554) for causing bustages complaining about gfxTextRun.cpp.
Backed out changeset 6181e40d4da1 (bug 1719554)
Backed out changeset c261ede6ae81 (bug 1719554)
Backed out changeset 221ec418475c (bug 1719554)
2021-12-04 00:58:15 +02:00
Dan Minor
c0ebed22d3 Bug 1719554 - Unify most of nsUnicodeProperties.h; r=platform-i18n-reviewers,jfkthame,gregtatum,necko-reviewers,valentin
This unifies most of the calls in nsUnicodeProperties.h. CharType and Script
will be handled in subsequent patches on this bug.

Differential Revision: https://phabricator.services.mozilla.com/D132273
2021-12-03 20:49:31 +00:00
Jonathan Kew
72e566334e Bug 1726570 - Accelerate nsFind by precomputing a const SharedBitSet for IsCombiningDiacritic. r=emilio
No user-visible change to behavior, except that searching a huge document
becomes slightly quicker.

Differential Revision: https://phabricator.services.mozilla.com/D123114
2021-08-23 14:17:54 +00:00
Alex Henrie
b24e5f0f51 Bug 1697076 - Drop assertion from mozilla::unicode::GetNaked. r=jfkthame
Differential Revision: https://phabricator.services.mozilla.com/D107942
2021-03-11 09:42:18 +00:00
Alex Henrie
f8f015b22e Bug 1649187 - Fix diacritic stripping for characters outside the BMP. r=jfkthame
Due to an unfortunate typo I made in base_chars.py, I thought that there
were no mappings we care about outside of the basic multilingual plane.
This patch adds back the non-BMP mappings that we do care about.

Differential Revision: https://phabricator.services.mozilla.com/D107404
2021-03-10 12:08:49 +00:00
Alex Henrie
0686831376 Bug 1649187 - Use a fallback table to strip diacritics from non-decomposable characters. r=jfkthame
Implement the design suggested at
https://bugzilla.mozilla.org/show_bug.cgi?id=1652910#c5

Differential Revision: https://phabricator.services.mozilla.com/D106674
2021-03-07 16:17:41 +00:00
Jonathan Kew
4a5876c846 Bug 1624244 - Exclude Japanese characters KATAKANA-HIRAGANA [SEMI-]VOICED SOUND MARK from the diacritics that can be ignored during search. r=m_kato
Differential Revision: https://phabricator.services.mozilla.com/D67834

--HG--
extra : moz-landing-system : lando
2020-03-30 13:53:20 +00:00
Alex Henrie
676b1a533d Bug 1614868 - Ignore combining diacritic characters in history search. r=jfkthame,mak
IsCombiningDiacritic(-1) returns false, so there is no need to specially
handle -1 in GetLowerUTF8Codepoint_inline.

It is no longer necessary for GetNaked to check whether a character is a
combining character because all callers now skip combining diacritics
and GetNaked already makes sure that decomposition removes a diacritic
and not something else.

Differential Revision: https://phabricator.services.mozilla.com/D62533

--HG--
extra : moz-landing-system : lando
2020-02-17 20:42:04 +00:00
Alex Henrie
d346ee224f Bug 1611568 - Ignore combining diacritic characters when "Match Diacritics" is off. r=jfkthame
Differential Revision: https://phabricator.services.mozilla.com/D61081

--HG--
extra : moz-landing-system : lando
2020-02-10 18:09:05 +00:00
Alex Henrie
00867c4809 Bug 202251 - Add an option to ignore diacritics when searching. r=fluent-reviewers,mikedeboer,jfkthame,flod
Differential Revision: https://phabricator.services.mozilla.com/D51841

--HG--
extra : moz-landing-system : lando
2019-12-09 19:26:40 +00:00
Brindusan Cristian
4b11b63400 Backed out changeset b89936db7178 (bug 202251) for bc failures at browser_misused_characters_in_strings.js. CLOSED TREE
2019-12-05 23:10:09 +02:00
Alex Henrie
ca467c4b3f Bug 202251 - Add an option to ignore diacritics when searching. r=fluent-reviewers,mikedeboer,jfkthame,flod
Differential Revision: https://phabricator.services.mozilla.com/D51841

--HG--
extra : moz-landing-system : lando
2019-12-05 18:08:20 +00:00
Alex Henrie
74cc0f4dce Bug 1591490 - Use the NS_IS_SURROGATE_PAIR macro everywhere. r=Ehsan
Differential Revision: https://phabricator.services.mozilla.com/D50697

--HG--
extra : moz-landing-system : lando
2019-10-27 05:05:51 +00:00
Sylvestre Ledru
ef0bfc3822 Bug 1519636 - Reformat recent changes to the Google coding style r=Ehsan
# ignore-this-changeset

Differential Revision: https://phabricator.services.mozilla.com/D24168

--HG--
extra : moz-landing-system : lando
2019-03-31 15:12:55 +00:00
Jonathan Kew
1327b56edd Bug 1529241 - Handle emoji-zwj sequences in unicode::ClusterIterator so that we avoid breaking them across lines or during selection. r=m_kato
Depends on D25100

Differential Revision: https://phabricator.services.mozilla.com/D25101

--HG--
extra : moz-landing-system : lando
2019-03-28 09:57:40 +00:00
Tooru Fujisawa
7983faeb5d Bug 1511393 - Use c-basic-offset: 2 in Emacs mode line for C/C++ code. r=nbp
2018-12-01 04:52:05 +09:00
Benjamin Bouvier
a7f1d173a0 Bug 1511383: Update vim modelines after clang-format; r=sylvestre
- modify line wrap up to 80 chars; (tw=80)
- modify size of tab to 2 chars everywhere; (sts=2, sw=2)

--HG--
extra : rebase_source : 7eedce0311b340c9a5a1265dc42d3121cc0f32a0
extra : amend_source : 9cb4ffdd5005f5c4c14172390dd00b04b2066cd7
2018-11-30 16:39:55 +01:00
Sylvestre Ledru
265e672179 Bug 1511181 - Reformat everything to the Google coding style r=ehsan a=clang-format
# ignore-this-changeset

--HG--
extra : amend_source : 4d301d3b0b8711c4692392aa76088ba7fd7d1022
2018-11-30 11:46:48 +01:00
Ehsan Akhgari
ca162bee20 Bug 1508472 - Part 4: Fourth batch of comment fix-ups in preparation for the tree reformat r=sylvestre
This is a best effort attempt at ensuring that the adverse impact of
reformatting the entire tree over the comments would be minimal.  I've used a
combination of strategies including disabling of formatting, some manual
formatting and some changes to formatting to work around some clang-format
limitations.

Differential Revision: https://phabricator.services.mozilla.com/D13193

--HG--
extra : moz-landing-system : lando
2018-11-28 09:16:55 +00:00
Jonathan Kew
98cb122caa Bug 1426827 - Treat Fitzpatrick skin-tone modifiers as cluster extenders when building textruns. r=m_kato
2018-07-25 09:38:10 +01:00
Jonathan Kew
c5cd6a1621 Bug 1477010 - Treat plane-14 tag characters as cluster extenders when building textruns, so that emoji flag sequences behave as single units. r=m_kato
2018-07-25 09:38:07 +01:00
Chris Peterson
2afd829d0f Bug 1469769 - Part 6: Replace non-failing NS_NOTREACHED with MOZ_ASSERT_UNREACHABLE. r=froydnj
This patch is an automatic replacement of s/NS_NOTREACHED/MOZ_ASSERT_UNREACHABLE/. Reindenting long lines and whitespace fixups follow in patch 6b.

MozReview-Commit-ID: 5UQVHElSpCr

--HG--
extra : rebase_source : 4c1b2fc32b269342f07639266b64941e2270e9c4
extra : source : 907543f6eae716f23a6de52b1ffb1c82908d158a
2018-06-17 22:43:11 -07:00
Jonathan Kew
6472fbd7dc Bug 1402271 - patch 3 - Remove non-ENABLE_INTL_API code paths from the nsUnicodeProperties code. r=m_kato
2017-09-25 09:18:20 +01:00
Xidorn Quan
ac6cc1d36a Bug 1368418 part 3 - Remove nsCategoryImp. r=emk
MozReview-Commit-ID: 5qCoeqfM2s5

--HG--
extra : rebase_source : 6dc1693ce61bea4ec35469a3388c75a9fb64e5b3
2017-05-29 16:17:39 +10:00
Jonathan Kew
b809e13f8d Bug 1281448 - part 1+2 - Update character property table generator script for Unicode 9 (in particular, security/xidmodifications.txt is replaced by security/IdentifierStatus.txt and IdentifierType.txt), and adjust APIs to fit the new identifier-type property model; update the generated data files. r=m_kato
2016-11-14 09:23:49 +00:00
Sebastian Hengst
4f23e5acc2 Backed out changeset 5d9a785a37c4 (bug 1281448) for Android bustage. r=backout
2016-11-14 10:45:52 +01:00
Jonathan Kew
51e4a42011 Bug 1281448 - part 1+2 - Update character property table generator script for Unicode 9 (in particular, security/xidmodifications.txt is replaced by security/IdentifierStatus.txt and IdentifierType.txt), and adjust APIs to fit the new identifier-type property model; update the generated data files. r=m_kato
2016-11-14 09:23:49 +00:00
Kan-Ru Chen
dc45f1b5b3 Bug 1081858 - Part 3. Implement IsEastAsianWidthFWH using ICU or nsUnicodeProperties data. r=jfkthame
MozReview-Commit-ID: DvBgSm5SJwD
2016-10-27 14:52:22 +08:00
Kan-Ru Chen
eb2f3cfed9 Bug 1081858 - Part 2. Add EastAsianWidthFWH data from Unicode's EastAsianWidth.txt to nsUnicodeProperties for builds without ICU. r=jfkthame
MozReview-Commit-ID: EOtAPx5ZY1U
2016-10-27 14:52:21 +08:00
Sebastian Hengst
c31797b642 Backed out changeset 1d3177608997 (bug 1081858)
2016-10-26 18:49:07 +02:00
Sebastian Hengst
ff84c3bee2 Backed out changeset 763deb5caa30 (bug 1081858)
2016-10-26 18:49:07 +02:00
Kan-Ru Chen
056f964bb8 Bug 1081858 - Part 3. Implement IsEastAsianWidthFWH using ICU or nsUnicodeProperties data. r=jfkthame
MozReview-Commit-ID: DvBgSm5SJwD
2016-10-26 19:15:27 +08:00
Kan-Ru Chen
f21980e3e5 Bug 1081858 - Part 2. Add EastAsianWidthFWH data from Unicode's EastAsianWidth.txt to nsUnicodeProperties for builds without ICU. r=jfkthame
MozReview-Commit-ID: EOtAPx5ZY1U
2016-10-26 19:15:27 +08:00
Jonathan Kew
3a650cc4a5 Bug 1312440 - Remove (unused) paired bracket data from our Unicode property tables when ICU is available. r=emk
2016-10-26 09:40:20 +01:00
Phil Ringnalda
93eb57bc06 Backed out 5 changesets (bug 1081858) for Android line-breaking reftest failures
Backed out changeset ac6306117c61 (bug 1081858)
Backed out changeset d9e96e907d0a (bug 1081858)
Backed out changeset 0dd35a1f895f (bug 1081858)
Backed out changeset ba420f595902 (bug 1081858)
Backed out changeset 44f9c7e8d124 (bug 1081858)

MozReview-Commit-ID: LV4YOozX3Ol
2016-10-25 20:38:20 -07:00
Kan-Ru Chen
ee45259740 Bug 1081858 - Followup, initialize nsCharProps2 properly. on a CLOSED TREE r=bustage
MozReview-Commit-ID: 2NHBuOsceOL
2016-10-26 09:28:41 +08:00
Kan-Ru Chen
62f72040da Bug 1081858 - Part 3. Implement IsEastAsianWidthFWH using ICU or nsUnicodeProperties data. r=jfkthame
2016-10-26 08:37:04 +08:00
Jonathan Kew
594fdb205d Bug 1305700 - pt 3 & 4 - Clean up/simplify use of ENABLE_INTL_API conditionals in nsUnicodeProperties (code rearrangement, no change in behavior). r=m_kato
2016-09-28 10:52:51 +01:00
Jonathan Kew
7f21325a4a Bug 1305700 - pt 1 & 2 - Exclude case mappings from nsUnicodePropertyData.cpp, and use ICU case mappings instead of our own table when building with ENABLE_INTL_API. r=m_kato
2016-09-28 10:47:05 +01:00
Xidorn Quan
8c11d66ab2 Bug 898984 - Part 1: Add ClusterReverseIterator in nsUnicodeProperties. r=jfkthame
2013-08-11 03:37:00 +09:00
Jonathan Kew
0b98a9737f Bug 1265631 - patch 2 - Implement GetLineBreakClass() accessor to get Unicode line-break class from ICU or nsUnicodeProperties data. r=masayuki
2016-04-26 10:32:17 +01:00
Xidorn Quan
19931babd5 Bug 1097499 part 8 - Move CountGraphemeClusters to mozilla::unicode. r=emk
MozReview-Commit-ID: J9yR8RPs5u8

--HG--
extra : source : 7b937b3ba984e84da808cd072037726b56da1826
2016-04-22 09:18:41 +10:00
Xidorn Quan
20a1eb1cd7 Bug 1097499 part 7 - Add reverse function of GetFullWidth. r=emk
MozReview-Commit-ID: HRDoZPzr1GO

--HG--
extra : source : 84cb256d16e07d5316db23d5a08353cc7f1abe2a
2016-04-22 09:18:41 +10:00
Jonathan Kew
c60f6a1ae4 Bug 1266391 - Introduce an enum class mozilla::unicode::Script, and use this instead of bare integers to specify script codes for better type checking. r=masayuki
2016-04-21 18:58:59 +01:00
Jonathan Kew
06f42574aa Bug 724538 - When ICU is available in the build, replace most of nsCharProps2 fields with ICU property accessors. r=emk
2016-01-13 15:45:22 +00:00