Commit 2eb5f4de authored by darin@apple.com

Change most call sites to call ICU directly instead of through WTF::Unicode

https://bugs.webkit.org/show_bug.cgi?id=122635

Reviewed by Alexey Proskuryakov.

Source/JavaScriptCore:

* parser/Lexer.cpp:
(JSC::isNonLatin1IdentStart): Take a UChar since that's what the only caller wants to pass.
Use U_GET_GC_MASK instead of WTF::Unicode::category.
(JSC::isNonLatin1IdentPart): Ditto.

* parser/Lexer.h:
(JSC::Lexer::isWhiteSpace): Use u_charType instead of WTF::Unicode::isSeparatorSpace.

* runtime/JSFunction.cpp: Removed "using namespace" for WTF::Unicode; it would no longer
compile since this file doesn't include anything that defines that namespace.

* runtime/JSGlobalObjectFunctions.cpp:
(JSC::isStrWhiteSpace): Use u_charType instead of WTF::Unicode::isSeparatorSpace.

* yarr/YarrInterpreter.cpp:
(JSC::Yarr::ByteCompiler::atomPatternCharacter): Use u_tolower and u_toupper instead of
Unicode::toLower and Unicode::toUpper. Also added some assertions since this code assumes
it can convert any UChar to lowercase or uppercase in another UChar, with no risk of needing
a UChar32 for the result. I guess that's probably true, but it would be good to know in a
debug build if not.
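
The category-mask idiom that replaces WTF::Unicode::category in Lexer.cpp can be sketched without ICU. Everything below is a hypothetical stand-in written for illustration: the Category enum, maskFor, and the ASCII-only categoryOf are not ICU names, and the real code uses U_GET_GC_MASK and the U_GC_*_MASK constants from <unicode/uchar.h>. The point is only that each category maps to one bit, so membership in a set of categories is a single AND.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ICU's general-category enum.
enum Category : unsigned {
    Unassigned,
    UppercaseLetter,
    LowercaseLetter,
    DecimalDigit,
    ConnectorPunctuation,
    SpaceSeparator
};

// One bit per category, in the spirit of ICU's U_MASK.
constexpr uint32_t maskFor(Category category) { return uint32_t{1} << category; }

constexpr uint32_t letterMask = maskFor(UppercaseLetter) | maskFor(LowercaseLetter);
constexpr uint32_t identPartMask = letterMask | maskFor(DecimalDigit) | maskFor(ConnectorPunctuation);

// Toy classifier covering a few ASCII ranges only (assumption: a real
// implementation would consult the Unicode character database).
Category categoryOf(char32_t c)
{
    if (c >= U'A' && c <= U'Z')
        return UppercaseLetter;
    if (c >= U'a' && c <= U'z')
        return LowercaseLetter;
    if (c >= U'0' && c <= U'9')
        return DecimalDigit;
    if (c == U'_')
        return ConnectorPunctuation;
    if (c == U' ')
        return SpaceSeparator;
    return Unassigned;
}

// Same shape as the rewritten isNonLatin1IdentStart / isNonLatin1IdentPart.
bool isIdentStart(char32_t c)
{
    return maskFor(categoryOf(c)) & letterMask;
}

bool isIdentPart(char32_t c)
{
    return (maskFor(categoryOf(c)) & identPartMask) || c == 0x200C || c == 0x200D;
}
```

With these stand-ins, isIdentStart(U'a') holds while isIdentStart(U'9') does not, and isIdentPart accepts digits, the underscore, and the two joiner code points.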

Source/WebCore:

* Modules/indexeddb/IDBKeyPath.cpp:
(isIdentifierStartCharacter): Use U_GET_GC_MASK instead of WTF::Unicode::category.
(isIdentifierCharacter): Ditto.

* css/CSSParser.cpp:
(WebCore::makeLower): Use u_tolower instead of WTF::Unicode::toLower.
Also assert the character fits in a UChar.

* dom/Document.cpp:
(WebCore::isValidNameStart): Use U_GET_GC_MASK instead of WTF::Unicode::category,
and u_getIntPropertyValue instead of WTF::Unicode::decompositionType.
(WebCore::isValidNamePart): Ditto.
(WebCore::canonicalizedTitle): Ditto.

* editing/Editor.cpp:
(WebCore::Editor::insertTextWithoutSendingTextEvent): Use u_isPunct instead of
WTF::Unicode::isPunct.

* editing/TextIterator.cpp:
(WebCore::SearchBuffer::append): Use u_strFoldCase instead of WTF::Unicode::foldCase.

* html/HTMLElement.cpp:
(WebCore::HTMLElement::directionality): Use UCharDirection instead of
WTF::Unicode::Direction.

* html/HTMLSelectElement.cpp:
(WebCore::HTMLSelectElement::defaultEventHandler): Use u_isprint instead of
WTF::Unicode::isPrintableChar.

* html/TypeAhead.cpp:
(WebCore::stripLeadingWhiteSpace): Use u_charDirection instead of
WTF::Unicode::direction.

* html/track/TextTrackCue.cpp:
(WebCore::isCueParagraphSeparator): Use u_charType instead of
WTF::Unicode::category.
(WebCore::TextTrackCue::determineTextDirection): Use u_charDirection instead of
WTF::Unicode::direction.

* page/ContextMenuController.cpp:
(WebCore::selectionContainsPossibleWord): Use U_GET_GC_MASK instead of
WTF::Unicode::category.
* platform/graphics/Font.cpp:
(WebCore::Font::canReceiveTextEmphasis): Ditto.

* platform/graphics/FontGlyphs.cpp:
(WebCore::FontGlyphs::glyphDataAndPageForCharacter): Use u_toupper instead of
WTF::Unicode::toUpper. Use u_charMirror instead of WTF::Unicode::mirroredChar.

* platform/graphics/GraphicsContext.cpp:
(WebCore::TextRunIterator::direction): Use u_charDirection instead of
WTF::Unicode::direction.

* platform/graphics/SVGGlyph.cpp:
(WebCore::charactersWithArabicForm): Use ublock_getCode instead of
WTF::Unicode::isArabicChar.

* platform/graphics/SurrogatePairAwareTextIterator.cpp:
(WebCore::SurrogatePairAwareTextIterator::normalizeVoicingMarks): Use
u_getCombiningClass instead of WTF::Unicode::combiningClass.

* platform/graphics/WidthIterator.cpp:
(WebCore::WidthIterator::advanceInternal): Use u_toupper instead of
WTF::Unicode::toUpper.

* platform/graphics/mac/ComplexTextController.cpp:
(WebCore::ComplexTextController::collectComplexTextRuns): Added some
assertions about the use of u_toupper and tweaked coding style a bit.

* platform/text/BidiContext.cpp:
(WebCore::BidiContext::createUncached): Use UCharDirection instead of
WTF::Unicode::Direction.
(WebCore::BidiContext::create): Ditto.
(WebCore::copyContextAndRebaselineLevel): Ditto.
* platform/text/BidiContext.h:
(WebCore::BidiContext::dir): Ditto.
(WebCore::BidiContext::BidiContext): Ditto.
* platform/text/BidiResolver.h:
(WebCore::BidiStatus::BidiStatus): Ditto.
(WebCore::BidiEmbedding::BidiEmbedding): Ditto.
(WebCore::BidiEmbedding::direction): Ditto.
(WebCore::BidiCharacterRun::BidiCharacterRun): Ditto.
(WebCore::BidiResolver::BidiResolver): Ditto.
(WebCore::BidiResolver::setLastDir): Ditto.
(WebCore::BidiResolver::setLastStrongDir): Ditto.
(WebCore::BidiResolver::setEorDir): Ditto.
(WebCore::BidiResolver::dir): Ditto.
(WebCore::BidiResolver::setDir): Ditto.
(WebCore::BidiResolver::appendRun): Ditto.
(WebCore::BidiResolver::embed): Ditto.
(WebCore::BidiResolver::checkDirectionInLowerRaiseEmbeddingLevel): Ditto.
(WebCore::BidiResolver::lowerExplicitEmbeddingLevel): Ditto.
(WebCore::BidiResolver::raiseExplicitEmbeddingLevel): Ditto.
(WebCore::BidiResolver::commitExplicitEmbedding): Ditto.
(WebCore::BidiResolver::updateStatusLastFromCurrentDirection): Ditto.
(WebCore::BidiResolver::createBidiRunsForLine): Ditto.

* platform/text/SegmentedString.h:
(WebCore::SegmentedString::advanceAndASSERTIgnoringCase): Use u_foldCase
instead of WTF::Unicode::foldCase.

* platform/text/TextBoundaries.cpp:
(WebCore::findNextWordFromIndex): Use u_isalnum instead of
WTF::Unicode::isAlphanumeric.

* platform/text/TextBoundaries.h:
(WebCore::requiresContextForWordBoundary): Use u_getIntPropertyValue directly
instead of WTF::Unicode::requiresComplexContextForWordBreaking.

* platform/text/mac/TextBoundaries.mm: Removed explicit use of WTF::Unicode,
which was unneeded and also will no longer compile.

* rendering/BidiRun.h:
(WebCore::BidiRun::BidiRun): Use UCharDirection instead of WTF::Unicode::Direction.
* rendering/InlineFlowBox.h: Ditto.
* rendering/InlineIterator.h:
(WebCore::embedCharFromDirection): Ditto.
(WebCore::notifyObserverWillExitObject): Ditto.
(WebCore::InlineIterator::direction): Ditto.
(WebCore::IsolateTracker::embed): Ditto.
(WebCore::InlineBidiResolver::appendRun): Ditto.

* rendering/RenderBlock.cpp:
(WebCore::isPunctuationForFirstLetter): Use U_GET_GC_MASK instead of
WTF::Unicode::category.

* rendering/RenderBlockLineLayout.cpp:
(WebCore::determineDirectionality): Use u_charDirection instead of
WTF::Unicode::direction.
(WebCore::RenderBlockFlow::handleTrailingSpaces): Ditto.
(WebCore::statusWithDirection): Ditto.
(WebCore::LineBreaker::nextSegmentBreak): Use U_GET_GC_MASK instead of
WTF::Unicode::category.

* rendering/RenderListMarker.cpp:
(WebCore::RenderListMarker::paint): Use u_charDirection instead of
WTF::Unicode::direction.

* rendering/RenderMenuList.cpp:
(WebCore::RenderMenuList::adjustInnerStyle): Use UCharDirection instead of
WTF::Unicode::Direction.

* rendering/RenderText.cpp:
(WebCore::makeCapitalized): Use u_totitle instead of WTF::Unicode::toTitleCase.
Also added a comment about the fact that we need to use u_strToTitle instead.

* rendering/RootInlineBox.cpp:
(WebCore::RootInlineBox::lineBreakBidiStatus): Use UCharDirection instead of
WTF::Unicode::Direction.

* svg/SVGFontData.cpp:
(WebCore::SVGFontData::createStringWithMirroredCharacters): Use u_charMirror
instead of WTF::Unicode::mirroredChar.

* xml/XPathParser.cpp:
(WebCore::XPath::charCat): Use U_GET_GC_MASK instead of WTF::Unicode::category.

* platform/graphics/win/UniscribeController.cpp:
(WebCore::UniscribeController::advance):
* platform/win/PopupMenuWin.cpp:
(WebCore::PopupMenuWin::paint):
* platform/win/WebCoreTextRenderer.cpp:
(WebCore::isOneLeftToRightRun):
More of the same for Windows.
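
Several entries above add an assertion that a case-mapped character still fits in a UChar before it is narrowed. A minimal sketch of that pattern follows, assuming hypothetical stand-ins throughout: Char16 and Char32 play the roles of UChar and UChar32, and toLowerStandIn is an ASCII-only substitute for u_tolower, not an ICU function.

```cpp
#include <cassert>
#include <cstdint>

using Char16 = char16_t; // stand-in for UChar
using Char32 = char32_t; // stand-in for UChar32

// Hypothetical ASCII-only substitute for a per-character lowercase mapping
// that, like u_tolower, returns a 32-bit code point.
Char32 toLowerStandIn(Char32 c)
{
    if (c >= U'A' && c <= U'Z')
        return static_cast<Char32>(c + (U'a' - U'A'));
    return c;
}

// Narrow the 32-bit result back to 16 bits, checking in debug builds that
// nothing is lost by the cast.
Char16 makeLower(Char16 c)
{
    Char32 lowered = toLowerStandIn(c);
    assert(lowered <= 0xFFFF);
    return static_cast<Char16>(lowered);
}
```

The assertion costs nothing in release builds and documents the assumption that the mapping never leaves the 16-bit range.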

Source/WTF:

* wtf/text/StringHash.h:
(WTF::CaseFoldingHash::foldCase): Use u_foldCase instead of WTF::Unicode::foldCase.
(WTF::CaseFoldingHash::hash): Added an overload for a StringImpl& because why not.

* wtf/text/StringImpl.cpp:
(WTF::StringImpl::lower): Use u_tolower rather than WTF::Unicode::toLower. Also added
an assertion to check that the lowercase version is also part of Latin-1. If this
is not guaranteed it would be good to know in a debug build at least. Use u_strToLower
rather than WTF::Unicode::toLower. Also removed #if USE(ICU_UNICODE) around the
locale-specific version.
(WTF::StringImpl::upper): Use u_toupper and u_strToUpper, as above.
(WTF::StringImpl::foldCase): Use u_tolower and u_strFoldCase, as above.
(WTF::equalIgnoringCase): Use u_foldCase instead of WTF::Unicode::foldCase.
(WTF::StringImpl::defaultWritingDirection): Use u_charDirection and UCharDirection
instead of WTF::Unicode::direction and WTF::Unicode::Direction.

* wtf/text/StringImpl.h:
(WTF::equalIgnoringCase): Use u_memcasecmp instead of WTF::Unicode::umemcasecmp.
(WTF::isSpaceOrNewline): Use u_charDirection instead of WTF::Unicode::direction.

* wtf/text/WTFString.h:
(WTF::String::defaultWritingDirection): Use UCharDirection instead of WTF::Unicode::Direction.

* wtf/unicode/icu/UnicodeIcu.h: Removed almost everything.

* wtf/unicode/wchar/UnicodeWchar.cpp: Tried to do the right thing in this file, but
I did not actually compile it. Also, the implementations here aren't really sufficient
to make WebKit work broadly. There are many things that just aren't working with this
implementation, such as parsing that uses u_charType to figure out which characters are valid.
(unorm_normalize): Added.
(u_charDirection): Added.
(u_charMirror): Added.
(u_charType): Added.
(u_getCombiningClass): Added.
(u_getIntPropertyValue): Added.
(u_memcasecmp): Added.
(convertWithFunction): Changed to work with ICU-style status code instead of error bool.
(u_strFoldCase): Added.
(u_strToLower): Added.
(u_strToUpper): Added.
* wtf/unicode/wchar/UnicodeWchar.h: Ditto. Later this file should just be named like the
real ICU headers so the code can include it the same way it would ICU. But that will be
in a future patch.
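
The status-code convention mentioned for convertWithFunction, and used by the rewritten StringImpl::lower, upper, and foldCase, follows a two-pass shape: convert into a buffer of the source length, and if the result needs a different length, allocate that length and convert again. The sketch below illustrates the shape only. Status and toUpperStandIn are hypothetical stand-ins loosely modelled on UErrorCode and u_strToUpper; they are not ICU APIs, and the stand-in handles just ASCII letters and U+00DF.

```cpp
#include <cassert>
#include <string>

enum Status { Ok, BufferOverflow }; // hypothetical stand-in for UErrorCode

// Hypothetical stand-in for a string case conversion: uppercases ASCII and
// expands U+00DF (sharp s) to "SS". It always returns the length the full
// result needs, writes only what fits, and reports overflow via status.
int toUpperStandIn(char16_t* dest, int capacity, const char16_t* src, int srcLength, Status* status)
{
    int needed = 0;
    auto put = [&](char16_t c) {
        if (needed < capacity)
            dest[needed] = c;
        ++needed;
    };
    for (int i = 0; i < srcLength; ++i) {
        char16_t c = src[i];
        if (c == 0x00DF) {
            put(u'S');
            put(u'S');
        } else if (c >= u'a' && c <= u'z')
            put(static_cast<char16_t>(c - (u'a' - u'A')));
        else
            put(c);
    }
    *status = needed > capacity ? BufferOverflow : Ok;
    return needed;
}

std::u16string upper(const std::u16string& source)
{
    int length = static_cast<int>(source.size());
    std::u16string result(source.size(), u'\0');

    // First pass: optimistically assume the result has the same length.
    Status status = Ok;
    int realLength = toUpperStandIn(&result[0], length, source.data(), length, &status);
    if (status == Ok && realLength == length)
        return result;

    // Second pass: retry with the length reported by the first pass,
    // resetting the status first, as the patched code does.
    result.assign(realLength, u'\0');
    status = Ok;
    toUpperStandIn(&result[0], realLength, source.data(), length, &status);
    if (status != Ok)
        return source; // mirrors "return this" on failure
    return result;
}
```

Resetting the status before the second call matters because a status left in a failure state from the first pass would otherwise be misread as a failure of the retry.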

git-svn-id: http://svn.webkit.org/repository/webkit/trunk@157330 268f45cc-cd09-0410-ab3c-d52691b4dbfc
parent 008e8dc2
2013-10-11 Darin Adler <darin@apple.com>
Change most call sites to call ICU directly instead of through WTF::Unicode
https://bugs.webkit.org/show_bug.cgi?id=122635
Reviewed by Alexey Proskuryakov.
* parser/Lexer.cpp:
(JSC::isNonLatin1IdentStart): Take a UChar since that's what the only caller wants to pass.
Use U_GET_GC_MASK instead of WTF::Unicode::category.
(JSC::isNonLatin1IdentPart): Ditto.
* parser/Lexer.h:
(JSC::Lexer::isWhiteSpace): Use u_charType instead of WTF::Unicode::isSeparatorSpace.
* runtime/JSFunction.cpp: Removed "using namespace" for WTF::Unicode, this will no longer
compile since this doesn't include anything that defines that namespace.
* runtime/JSGlobalObjectFunctions.cpp:
(JSC::isStrWhiteSpace): Use u_charType instead of WTF::Unicode::isSeparatorSpace.
* yarr/YarrInterpreter.cpp:
(JSC::Yarr::ByteCompiler::atomPatternCharacter): Use u_tolower and u_toupper instead of
Unicode::toLower and Unicode::toUpper. Also added some assertions since this code assumes
it can convert any UChar to lowercase or uppercase in another UChar, with no risk of needing
a UChar32 for the result. I guess that's probably true, but it would be good to know in a
debug build if not.
2013-10-11 Nadav Rotem <nrotem@apple.com>
DFG: Add JIT support for LogicalNot(String/StringIdent)
@@ -37,9 +37,6 @@
#include <string.h>
#include <wtf/Assertions.h>
using namespace WTF;
using namespace Unicode;
#include "KeywordLookup.h"
#include "Lexer.lut.h"
#include "Parser.h"
@@ -637,9 +634,9 @@ ALWAYS_INLINE bool Lexer<T>::lastTokenWasRestrKeyword() const
return m_lastToken == CONTINUE || m_lastToken == BREAK || m_lastToken == RETURN || m_lastToken == THROW;
}
static NEVER_INLINE bool isNonLatin1IdentStart(int c)
static NEVER_INLINE bool isNonLatin1IdentStart(UChar c)
{
return category(c) & (Letter_Uppercase | Letter_Lowercase | Letter_Titlecase | Letter_Modifier | Letter_Other);
return U_GET_GC_MASK(c) & U_GC_L_MASK;
}
static ALWAYS_INLINE bool isLatin1(LChar)
@@ -664,8 +661,7 @@ static inline bool isIdentStart(UChar c)
static NEVER_INLINE bool isNonLatin1IdentPart(int c)
{
return (category(c) & (Letter_Uppercase | Letter_Lowercase | Letter_Titlecase | Letter_Modifier | Letter_Other
| Mark_NonSpacing | Mark_SpacingCombining | Number_DecimalDigit | Punctuation_Connector)) || c == 0x200C || c == 0x200D;
return (U_GET_GC_MASK(c) & (U_GC_L_MASK | U_GC_MN_MASK | U_GC_MC_MASK | U_GC_ND_MASK | U_GC_PC_MASK)) || c == 0x200C || c == 0x200D;
}
static ALWAYS_INLINE bool isIdentPart(LChar c)
......
@@ -248,7 +248,7 @@ ALWAYS_INLINE bool Lexer<LChar>::isWhiteSpace(LChar ch)
template <>
ALWAYS_INLINE bool Lexer<UChar>::isWhiteSpace(UChar ch)
{
return (ch < 256) ? Lexer<LChar>::isWhiteSpace(static_cast<LChar>(ch)) : (WTF::Unicode::isSeparatorSpace(ch) || ch == 0xFEFF);
return (ch < 256) ? Lexer<LChar>::isWhiteSpace(static_cast<LChar>(ch)) : (u_charType(ch) == U_SPACE_SEPARATOR || ch == 0xFEFF);
}
template <>
......
@@ -45,10 +45,8 @@
#include "PropertyNameArray.h"
#include "StackVisitor.h"
using namespace WTF;
using namespace Unicode;
namespace JSC {
EncodedJSValue JSC_HOST_CALL callHostFunctionAsConstructor(ExecState* exec)
{
return throwVMError(exec, createNotAConstructorError(exec, exec->callee()));
......
@@ -169,7 +169,7 @@ bool isStrWhiteSpace(UChar c)
case 0xFEFF:
return true;
default:
return c > 0xff && isSeparatorSpace(c);
return c > 0xFF && u_charType(c) == U_SPACE_SEPARATOR;
}
}
......
@@ -1510,8 +1510,11 @@ public:
void atomPatternCharacter(UChar ch, unsigned inputPosition, unsigned frameLocation, Checked<unsigned> quantityCount, QuantifierType quantityType)
{
if (m_pattern.m_ignoreCase) {
UChar lo = Unicode::toLower(ch);
UChar hi = Unicode::toUpper(ch);
ASSERT(u_tolower(ch) <= 0xFFFF);
ASSERT(u_toupper(ch) <= 0xFFFF);
UChar lo = u_tolower(ch);
UChar hi = u_toupper(ch);
if (lo != hi) {
m_bodyDisjunction->terms.append(ByteTerm(lo, hi, inputPosition, frameLocation, quantityCount, quantityType));
......
2013-10-11 Darin Adler <darin@apple.com>
Change most call sites to call ICU directly instead of through WTF::Unicode
https://bugs.webkit.org/show_bug.cgi?id=122635
Reviewed by Alexey Proskuryakov.
* wtf/text/StringHash.h:
(WTF::CaseFoldingHash::foldCase): Use u_foldCase instead of WTF::Unicode::foldCase.
(WTF::CaseFoldingHash::hash): Added an overload for a StringImpl& because why not.
* wtf/text/StringImpl.cpp:
(WTF::StringImpl::lower): Use u_tolower rather than WTF::Unicode::toLower. Also added
an assertion to check that the lowercase version is also part of Latin-1. If this
is not guaranteed it would be good to know in a debug build at least. Use u_strToLower
rather than WTF::Unicode::toLower. Also removed #if USE(ICU_UNICODE) around the
locale-specific version.
(WTF::StringImpl::upper): Use u_toupper and u_strToUpper, as above.
(WTF::StringImpl::foldCase): Use u_tolower and u_strFoldCase, as above.
(WTF::equalIgnoringCase): Use u_foldCase instead of WTF::Unicode::foldCase.
(WTF::StringImpl::defaultWritingDirection): Use u_charDirection and UCharDirection
instead of WTF::Unicode::direction and WTF::Unicode::Direction.
* wtf/text/StringImpl.h:
(WTF::equalIgnoringCase): Use u_memcasecmp instead of WTF::Unicode::umemcasecmp.
(WTF::isSpaceOrNewline): Use u_charDirection instead of WTF::Unicode::direction.
* wtf/text/WTFString.h:
(WTF::String::defaultWritingDirection): Use UCharDirection instead of WTF::Unicode::Direction.
* wtf/unicode/icu/UnicodeIcu.h: Removed almost everything.
* wtf/unicode/wchar/UnicodeWchar.cpp: Tried to do the right thing in this file, but
I did not actually compile it. Also, the implementations here aren't really sufficient
to make WebKit work broadly. There are many things that just aren't working with this
implementation, such as parsing that uses u_charType to figure out which characters are valid.
(unorm_normalize): Added.
(u_charDirection): Added.
(u_charMirror): Added.
(u_charType): Added.
(u_getCombiningClass): Added.
(u_getIntPropertyValue): Added.
(u_memcasecmp): Added.
(convertWithFunction): Changed to work with ICU-style status code instead of error bool.
(u_strFoldCase): Added.
(u_strToLower): Added.
(u_strToUpper): Added.
* wtf/unicode/wchar/UnicodeWchar.h: Ditto. Later this file should just be named like the
real ICU headers so the code can include it the same way it would ICU. But that will be
in a future patch.
2013-10-11 Anders Carlsson <andersca@apple.com>
Remove gesture event support from WebCore
......
@@ -73,26 +73,31 @@ namespace WTF {
class CaseFoldingHash {
public:
template<typename T> static inline UChar foldCase(T ch)
template<typename T> static inline UChar foldCase(T character)
{
return WTF::Unicode::foldCase(ch);
return u_foldCase(character, U_FOLD_CASE_DEFAULT);
}
static unsigned hash(const UChar* data, unsigned length)
{
return StringHasher::computeHashAndMaskTop8Bits<UChar, foldCase<UChar> >(data, length);
return StringHasher::computeHashAndMaskTop8Bits<UChar, foldCase<UChar>>(data, length);
}
static unsigned hash(StringImpl* str)
static unsigned hash(StringImpl& string)
{
if (str->is8Bit())
return hash(str->characters8(), str->length());
return hash(str->characters16(), str->length());
if (string.is8Bit())
return hash(string.characters8(), string.length());
return hash(string.characters16(), string.length());
}
static unsigned hash(StringImpl* string)
{
ASSERT(string);
return hash(*string);
}
static unsigned hash(const LChar* data, unsigned length)
{
return StringHasher::computeHashAndMaskTop8Bits<LChar, foldCase<LChar> >(data, length);
return StringHasher::computeHashAndMaskTop8Bits<LChar, foldCase<LChar>>(data, length);
}
static inline unsigned hash(const char* data, unsigned length)
@@ -107,7 +112,7 @@ namespace WTF {
static unsigned hash(const RefPtr<StringImpl>& key)
{
return hash(key.get());
return hash(*key);
}
static bool equal(const RefPtr<StringImpl>& a, const RefPtr<StringImpl>& b)
@@ -129,7 +134,9 @@ namespace WTF {
}
static bool equal(const AtomicString& a, const AtomicString& b)
{
return (a == b) || equal(a.impl(), b.impl());
// FIXME: Is the "a == b" here a helpful optimization?
// It makes all cases where the strings are not identical slightly slower.
return a == b || equal(a.impl(), b.impl());
}
static const bool safeToCompareToEmptyOrDeleted = false;
......
@@ -415,8 +415,10 @@ SlowPath8bitLower:
LChar character = m_data8[i];
if (!(character & ~0x7F))
data8[i] = toASCIILower(character);
else
data8[i] = static_cast<LChar>(Unicode::toLower(character));
else {
ASSERT(u_tolower(character) <= 0xFF);
data8[i] = static_cast<LChar>(u_tolower(character));
}
}
return newImpl.release();
@@ -453,14 +455,15 @@ SlowPath8bitLower:
UChar* data16;
RefPtr<StringImpl> newImpl = createUninitializedInternalNonEmpty(m_length, data16);
bool error;
int32_t realLength = Unicode::toLower(data16, length, m_data16, m_length, &error);
if (!error && realLength == length)
UErrorCode status = U_ZERO_ERROR;
int32_t realLength = u_strToLower(data16, length, m_data16, m_length, "", &status);
if (U_SUCCESS(status) && realLength == length)
return newImpl.release();
newImpl = createUninitialized(realLength, data16);
Unicode::toLower(data16, realLength, m_data16, m_length, &error);
if (error)
status = U_ZERO_ERROR;
u_strToLower(data16, realLength, m_data16, m_length, "", &status);
if (U_FAILURE(status))
return this;
return newImpl.release();
}
@@ -506,7 +509,8 @@ PassRefPtr<StringImpl> StringImpl::upper()
LChar c = m_data8[i];
if (UNLIKELY(c == smallLetterSharpS))
++numberSharpSCharacters;
UChar upper = Unicode::toUpper(c);
ASSERT(u_toupper(c) <= 0xFFFF);
UChar upper = u_toupper(c);
if (UNLIKELY(upper > 0xff)) {
// Since this upper-cased character does not fit in an 8-bit string, we need to take the 16-bit path.
goto upconvert;
@@ -527,8 +531,10 @@ PassRefPtr<StringImpl> StringImpl::upper()
if (c == smallLetterSharpS) {
*dest++ = 'S';
*dest++ = 'S';
} else
*dest++ = static_cast<LChar>(Unicode::toUpper(c));
} else {
ASSERT(u_toupper(c) <= 0xFF);
*dest++ = static_cast<LChar>(u_toupper(c));
}
}
return newImpl.release();
@@ -551,13 +557,14 @@ upconvert:
return newImpl.release();
// Do a slower implementation for cases that include non-ASCII characters.
bool error;
int32_t realLength = Unicode::toUpper(data16, length, source16, m_length, &error);
if (!error && realLength == length)
UErrorCode status = U_ZERO_ERROR;
int32_t realLength = u_strToUpper(data16, length, source16, m_length, "", &status);
if (U_SUCCESS(status) && realLength == length)
return newImpl;
newImpl = createUninitialized(realLength, data16);
Unicode::toUpper(data16, realLength, source16, m_length, &error);
if (error)
status = U_ZERO_ERROR;
u_strToUpper(data16, realLength, source16, m_length, "", &status);
if (U_FAILURE(status))
return this;
return newImpl.release();
}
@@ -574,10 +581,6 @@ static inline bool needsTurkishCasingRules(const AtomicString& localeIdentifier)
RefPtr<StringImpl> StringImpl::lower(const AtomicString& localeIdentifier)
{
#if !USE(ICU_UNICODE)
UNUSED_PARAM(localeIdentifier);
return lower();
#else
// Use the more-optimized code path most of the time.
// Assuming here that the only locale-specific lowercasing is the Turkish casing rules.
// FIXME: Could possibly optimize further by looking for the specific sequences
@@ -603,21 +606,16 @@ RefPtr<StringImpl> StringImpl::lower(const AtomicString& localeIdentifier)
int realLength = u_strToLower(data16, length, source16, length, "tr", &status);
if (U_SUCCESS(status) && realLength == length)
return newString;
status = U_ZERO_ERROR;
newString = createUninitialized(realLength, data16);
status = U_ZERO_ERROR;
u_strToLower(data16, realLength, source16, length, "tr", &status);
if (U_FAILURE(status))
return this;
return newString.release();
#endif
}
RefPtr<StringImpl> StringImpl::upper(const AtomicString& localeIdentifier)
{
#if !USE(ICU_UNICODE)
UNUSED_PARAM(localeIdentifier);
return upper();
#else
// Use the more-optimized code path most of the time.
// Assuming here that the only locale-specific lowercasing is the Turkish casing rules,
// and that the only affected character is lowercase "i".
@@ -644,7 +642,6 @@ RefPtr<StringImpl> StringImpl::upper(const AtomicString& localeIdentifier)
if (U_FAILURE(status))
return this;
return newString.release();
#endif
}
PassRefPtr<StringImpl> StringImpl::fill(UChar character)
@@ -685,8 +682,11 @@ PassRefPtr<StringImpl> StringImpl::foldCase()
return newImpl.release();
// Do a slower implementation for cases that include non-ASCII Latin-1 characters.
for (int32_t i = 0; i < length; ++i)
data[i] = static_cast<LChar>(Unicode::toLower(m_data8[i]));
// FIXME: Shouldn't this use u_foldCase instead of u_tolower?
for (int32_t i = 0; i < length; ++i) {
ASSERT(u_tolower(m_data8[i]) <= 0xFF);
data[i] = static_cast<LChar>(u_tolower(m_data8[i]));
}
return newImpl.release();
}
@@ -704,13 +704,14 @@ PassRefPtr<StringImpl> StringImpl::foldCase()
return newImpl.release();
// Do a slower implementation for cases that include non-ASCII characters.
bool error;
int32_t realLength = Unicode::foldCase(data, length, m_data16, m_length, &error);
if (!error && realLength == length)
UErrorCode status = U_ZERO_ERROR;
int32_t realLength = u_strFoldCase(data, length, m_data16, m_length, U_FOLD_CASE_DEFAULT, &status);
if (U_SUCCESS(status) && realLength == length)
return newImpl.release();
newImpl = createUninitialized(realLength, data);
Unicode::foldCase(data, realLength, m_data16, m_length, &error);
if (error)
status = U_ZERO_ERROR;
u_strFoldCase(data, realLength, m_data16, m_length, U_FOLD_CASE_DEFAULT, &status);
if (U_FAILURE(status))
return this;
return newImpl.release();
}
@@ -952,8 +953,7 @@ float StringImpl::toFloat(bool* ok)
bool equalIgnoringCase(const LChar* a, const LChar* b, unsigned length)
{
while (length--) {
LChar bc = *b++;
if (foldCase(*a++) != foldCase(bc))
if (u_foldCase(*a++, U_FOLD_CASE_DEFAULT) != u_foldCase(*b++, U_FOLD_CASE_DEFAULT))
return false;
}
return true;
@@ -962,8 +962,7 @@ bool equalIgnoringCase(const LChar* a, const LChar* b, unsigned length)
bool equalIgnoringCase(const UChar* a, const LChar* b, unsigned length)
{
while (length--) {
LChar bc = *b++;
if (foldCase(*a++) != foldCase(bc))
if (u_foldCase(*a++, U_FOLD_CASE_DEFAULT) != u_foldCase(*b++, U_FOLD_CASE_DEFAULT))
return false;
}
return true;
@@ -1911,7 +1910,7 @@ bool equalIgnoringCase(const StringImpl* a, const LChar* b)
if (ored & ~0x7F) {
equal = true;
for (unsigned i = 0; i != length; ++i)
equal = equal && (foldCase(as[i]) == foldCase(b[i]));
equal = equal && u_foldCase(as[i], U_FOLD_CASE_DEFAULT) == u_foldCase(b[i], U_FOLD_CASE_DEFAULT);
}
return equal && !b[length];
@@ -1931,7 +1930,7 @@ bool equalIgnoringCase(const StringImpl* a, const LChar* b)
if (ored & ~0x7F) {
equal = true;
for (unsigned i = 0; i != length; ++i) {
equal = equal && (foldCase(as[i]) == foldCase(b[i]));
equal = equal && u_foldCase(as[i], U_FOLD_CASE_DEFAULT) == u_foldCase(b[i], U_FOLD_CASE_DEFAULT);
}
}
@@ -1970,24 +1969,24 @@ bool equalIgnoringNullity(StringImpl* a, StringImpl* b)
return equal(a, b);
}
WTF::Unicode::Direction StringImpl::defaultWritingDirection(bool* hasStrongDirectionality)
UCharDirection StringImpl::defaultWritingDirection(bool* hasStrongDirectionality)
{
for (unsigned i = 0; i < m_length; ++i) {
WTF::Unicode::Direction charDirection = WTF::Unicode::direction(is8Bit() ? m_data8[i] : m_data16[i]);
if (charDirection == WTF::Unicode::LeftToRight) {
UCharDirection charDirection = u_charDirection(is8Bit() ? m_data8[i] : m_data16[i]);
if (charDirection == U_LEFT_TO_RIGHT) {
if (hasStrongDirectionality)
*hasStrongDirectionality = true;
return WTF::Unicode::LeftToRight;
return U_LEFT_TO_RIGHT;
}
if (charDirection == WTF::Unicode::RightToLeft || charDirection == WTF::Unicode::RightToLeftArabic) {
if (charDirection == U_RIGHT_TO_LEFT || charDirection == U_RIGHT_TO_LEFT_ARABIC) {
if (hasStrongDirectionality)
*hasStrongDirectionality = true;
return WTF::Unicode::RightToLeft;
return U_RIGHT_TO_LEFT;
}
}
if (hasStrongDirectionality)
*hasStrongDirectionality = false;
return WTF::Unicode::LeftToRight;
return U_LEFT_TO_RIGHT;
}
PassRefPtr<StringImpl> StringImpl::adopt(StringBuffer<LChar>& buffer)
......
@@ -745,7 +745,7 @@ public:
WTF_EXPORT_STRING_API PassRefPtr<StringImpl> replace(StringImpl*, StringImpl*);
WTF_EXPORT_STRING_API PassRefPtr<StringImpl> replace(unsigned index, unsigned len, StringImpl*);
WTF_EXPORT_STRING_API WTF::Unicode::Direction defaultWritingDirection(bool* hasStrongDirectionality = 0);
WTF_EXPORT_STRING_API UCharDirection defaultWritingDirection(bool* hasStrongDirectionality = nullptr);
#if USE(CF)
RetainPtr<CFStringRef> createCFString();
@@ -1107,7 +1107,7 @@ inline bool equalIgnoringCase(const char* a, const LChar* b, unsigned length) {
inline bool equalIgnoringCase(const UChar* a, const UChar* b, int length)
{
ASSERT(length >= 0);
return !Unicode::umemcasecmp(a, b, length);
return !u_memcasecmp(a, b, length, U_FOLD_CASE_DEFAULT);
}
WTF_EXPORT_STRING_API bool equalIgnoringCaseNonNull(const StringImpl*, const StringImpl*);
@@ -1317,7 +1317,7 @@ static inline bool isSpaceOrNewline(UChar c)
{
// Use isASCIISpace() for basic Latin-1.
// This will include newlines, which aren't included in Unicode DirWS.
return c <= 0x7F ? WTF::isASCIISpace(c) : WTF::Unicode::direction(c) == WTF::Unicode::WhiteSpaceNeutral;
return c <= 0x7F ? isASCIISpace(c) : u_charDirection(c) == U_WHITE_SPACE_NEUTRAL;
}
template<typename CharacterType>
......
@@ -440,13 +440,13 @@ public:
static String fromUTF8WithLatin1Fallback(const char* s, size_t length) { return fromUTF8WithLatin1Fallback(reinterpret_cast<const LChar*>(s), length); };
// Determines the writing direction using the Unicode Bidi Algorithm rules P2 and P3.
WTF::Unicode::Direction defaultWritingDirection(bool* hasStrongDirectionality = 0) const
UCharDirection defaultWritingDirection(bool* hasStrongDirectionality = nullptr) const
{
if (m_impl)
return m_impl->defaultWritingDirection(hasStrongDirectionality);
if (hasStrongDirectionality)
*hasStrongDirectionality = false;
return WTF::Unicode::LeftToRight;
return U_LEFT_TO_RIGHT;
}
bool containsOnlyASCII() const;
......
@@ -23,218 +23,10 @@
#ifndef WTF_UNICODE_ICU_H
#define WTF_UNICODE_ICU_H
#if USE(ICU_UNICODE)
#include <stdlib.h>
#include <unicode/uchar.h>
#include <unicode/uscript.h>
#include <unicode/ustring.h>
#include <unicode/utf16.h>
namespace WTF {
namespace Unicode {
enum Direction {
LeftToRight = U_LEFT_TO_RIGHT,
RightToLeft = U_RIGHT_TO_LEFT,
EuropeanNumber = U_EUROPEAN_NUMBER,
EuropeanNumberSeparator = U_EUROPEAN_NUMBER_SEPARATOR,
EuropeanNumberTerminator = U_EUROPEAN_NUMBER_TERMINATOR,
ArabicNumber = U_ARABIC_NUMBER,
CommonNumberSeparator = U_COMMON_NUMBER_SEPARATOR,
BlockSeparator = U_BLOCK_SEPARATOR,
SegmentSeparator = U_SEGMENT_SEPARATOR,
WhiteSpaceNeutral = U_WHITE_SPACE_NEUTRAL,
OtherNeutral = U_OTHER_NEUTRAL,
LeftToRightEmbedding = U_LEFT_TO_RIGHT_EMBEDDING,
LeftToRightOverride = U_LEFT_TO_RIGHT_OVERRIDE,
RightToLeftArabic = U_RIGHT_TO_LEFT_ARABIC,
RightToLeftEmbedding = U_RIGHT_TO_LEFT_EMBEDDING,
RightToLeftOverride = U_RIGHT_TO_LEFT_OVERRIDE,
PopDirectionalFormat = U_POP_DIRECTIONAL_FORMAT,
NonSpacingMark = U_DIR_NON_SPACING_MARK,
BoundaryNeutral = U_BOUNDARY_NEUTRAL
};
enum DecompositionType {
DecompositionNone = U_DT_NONE,
DecompositionCanonical = U_DT_CANONICAL,
DecompositionCompat = U_DT_COMPAT,
DecompositionCircle = U_DT_CIRCLE,
DecompositionFinal = U_DT_FINAL,
DecompositionFont = U_DT_FONT,
DecompositionFraction = U_DT_FRACTION,
DecompositionInitial = U_DT_INITIAL,
DecompositionIsolated = U_DT_ISOLATED,
DecompositionMedial = U_DT_MEDIAL,
DecompositionNarrow = U_DT_NARROW,
DecompositionNoBreak = U_DT_NOBREAK,
DecompositionSmall = U_DT_SMALL,
DecompositionSquare = U_DT_SQUARE,
DecompositionSub = U_DT_SUB,
DecompositionSuper = U_DT_SUPER,
DecompositionVertical = U_DT_VERTICAL,
DecompositionWide = U_DT_WIDE,
};
enum CharCategory {
NoCategory = 0,
Other_NotAssigned = U_MASK(U_GENERAL_OTHER_TYPES),
Letter_Uppercase = U_MASK(U_UPPERCASE_LETTER),
Letter_Lowercase = U_MASK(U_LOWERCASE_LETTER),
Letter_Titlecase = U_MASK(U_TITLECASE_LETTER),
Letter_Modifier = U_MASK(U_MODIFIER_LETTER),
Letter_Other = U_MASK(U_OTHER_LETTER),
Mark_NonSpacing = U_MASK(U_NON_SPACING_MARK),
Mark_Enclosing = U_MASK(U_ENCLOSING_MARK),
Mark_SpacingCombining = U_MASK(U_COMBINING_SPACING_MARK),
Number_DecimalDigit = U_MASK(U_DECIMAL_DIGIT_NUMBER),
Number_Letter = U_MASK(U_LETTER_NUMBER),
Number_Other = U_MASK(U_OTHER_NUMBER),
Separator_Space = U_MASK(U_SPACE_SEPARATOR),
Separator_Line = U_MASK(U_LINE_SEPARATOR),
Separator_Paragraph = U_MASK(U_PARAGRAPH_SEPARATOR),
Other_Control = U_MASK(U_CONTROL_CHAR),
Other_Format = U_MASK(U_FORMAT_CHAR),
Other_PrivateUse = U_MASK(U_PRIVATE_USE_CHAR),
Other_Surrogate = U_MASK(U_SURROGATE),
Punctuation_Dash = U_MASK(U_DASH_PUNCTUATION),
Punctuation_Open = U_MASK(U_START_PUNCTUATION),
Punctuation_Close = U_MASK(U_END_PUNCTUATION),
Punctuation_Connector = U_MASK(U_CONNECTOR_PUNCTUATION),
Punctuation_Other = U_MASK(U_OTHER_PUNCTUATION),
Symbol_Math = U_MASK(U_MATH_SYMBOL),
Symbol_Currency = U_MASK(U_CURRENCY_SYMBOL),
Symbol_Modifier = U_MASK(U_MODIFIER_SYMBOL),
Symbol_Other = U_MASK(U_OTHER_SYMBOL),
Punctuation_InitialQuote = U_MASK(U_INITIAL_PUNCTUATION),
Punctuation_FinalQuote = U_MASK(U_FINAL_PUNCTUATION)
};
inline UChar32 foldCase(UChar32 c)
{
return u_foldCase(c, U_FOLD_CASE_DEFAULT);
}
inline int foldCase(UChar* result, int resultLength, const UChar* src, int srcLength, bool* error)
{
UErrorCode status = U_ZERO_ERROR;
int realLength = u_strFoldCase(result, resultLength, src, srcLength, U_FOLD_CASE_DEFAULT, &status);
*error = !U_SUCCESS(status);
return realLength;
}
inline int toLower(UChar* result, int resultLength, const UChar* src, int srcLength, bool* error)
{
UErrorCode status = U_ZERO_ERROR;
int realLength = u_strToLower(result, resultLength, src, srcLength, "", &status);
*error = !!U_FAILURE(status);
return realLength;
}
inline UChar32 toLower(UChar32 c)
{
return u_tolower(c);
}
inline UChar32 toUpper(UChar32 c)
{
return u_toupper(c);
}
inline int toUpper(UChar* result, int resultLength, const UChar* src, int srcLength, bool* error)
{
UErrorCode status = U_ZERO_ERROR;
int realLength = u_strToUpper(result, resultLength, src, srcLength, "", &status);
*error = !!U_FAILURE(status);
return realLength;
}
inline UChar32 toTitleCase(UChar32 c)
{
return u_totitle(c);
}
inline bool isArabicChar(UChar32 c)
{