Commit adf5c3c2 authored by dbates@webkit.org

XSS filter bypass via non-standard URL encoding

https://bugs.webkit.org/show_bug.cgi?id=66588

Reviewed by Adam Barth.

Source/WebCore: 

Tests: http/tests/security/xssAuditor/script-tag-with-16bit-unicode-surrogate-pair.html
       http/tests/security/xssAuditor/script-tag-with-16bit-unicode.html
       http/tests/security/xssAuditor/script-tag-with-16bit-unicode2.html
       http/tests/security/xssAuditor/script-tag-with-16bit-unicode3.html
       http/tests/security/xssAuditor/script-tag-with-16bit-unicode4.html
       http/tests/security/xssAuditor/script-tag-with-16bit-unicode5.html
       http/tests/security/xssAuditor/script-tag-with-three-times-url-encoded-16bit-unicode.html
       http/tests/security/xssAuditor/window-open-without-url-should-not-assert.html

Implement support for decoding non-standard 16-bit Unicode escape sequences of
the form %u26C4 as described in <http://www.w3.org/International/iri-edit/draft-duerst-iri.html#anchor29>.

See also <http://en.wikipedia.org/wiki/Percent-encoding#Non-standard_implementations>.
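
As a rough illustration of the %uXXXX form described above, the following standalone sketch decodes such sequences into UTF-8 and combines a high/low surrogate pair into one code point. It is not the WebCore implementation: the names decodePercentU, parseCodeUnit and appendUTF8 are hypothetical, and replacing an unpaired surrogate with U+FFFD is an assumption made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helpers for illustration; none of these names come from WebKit.
static bool isHexDigit(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

static unsigned hexValue(char c)
{
    assert(isHexDigit(c));
    if (c <= '9')
        return c - '0';
    return (c | 0x20) - 'a' + 10; // fold upper case onto lower case
}

// Returns true if input[pos, pos + 6) has the form %uXXXX and stores the code unit.
static bool parseCodeUnit(const std::string& input, std::size_t pos, std::uint16_t& unit)
{
    if (input.size() < 6 || pos > input.size() - 6)
        return false;
    if (input[pos] != '%' || input[pos + 1] != 'u')
        return false;
    unsigned value = 0;
    for (std::size_t i = 2; i < 6; ++i) {
        if (!isHexDigit(input[pos + i]))
            return false;
        value = (value << 4) | hexValue(input[pos + i]);
    }
    unit = static_cast<std::uint16_t>(value);
    return true;
}

// Appends the UTF-8 encoding of a code point (at most U+10FFFF) to out.
static void appendUTF8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Decodes every well-formed %uXXXX sequence; all other text is copied unchanged.
std::string decodePercentU(const std::string& input)
{
    std::string out;
    std::size_t i = 0;
    while (i < input.size()) {
        std::uint16_t unit = 0;
        if (!parseCodeUnit(input, i, unit)) {
            out += input[i++];
            continue;
        }
        i += 6;
        std::uint32_t cp = unit;
        std::uint16_t low = 0;
        if (unit >= 0xD800 && unit <= 0xDBFF && parseCodeUnit(input, i, low)
            && low >= 0xDC00 && low <= 0xDFFF) {
            // High surrogate followed by low surrogate: combine into one code point.
            cp = 0x10000 + ((static_cast<std::uint32_t>(unit) - 0xD800) << 10) + (low - 0xDC00);
            i += 6;
        } else if (unit >= 0xD800 && unit <= 0xDFFF) {
            cp = 0xFFFD; // Assumption: an unpaired surrogate becomes U+FFFD.
        }
        appendUTF8(out, cp);
    }
    return out;
}
```

With this sketch, decodePercentU("%u26C4") yields the three UTF-8 bytes E2 9B 84, and the pair %uD834%uDD1E yields the four bytes F0 9D 84 9E (U+1D11E).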

* GNUmakefile.list.am: Added DecodeEscapeSequences.h.
* WebCore.gypi: Ditto.
* WebCore.pro: Ditto.
* WebCore.vcproj/WebCore.vcproj: Ditto.
* WebCore.xcodeproj/project.pbxproj: Ditto.
* html/parser/XSSAuditor.cpp:
(WebCore::decode16BitUnicodeEscapeSequences): Added.
(WebCore::decodeStandardURLEscapeSequences): Added.
(WebCore::fullyDecodeString): Modified to call decode16BitUnicodeEscapeSequences().
(WebCore::XSSAuditor::init): Modified to return early when the URL of the document
is the empty string. This can happen when opening a new browser window or calling
window.open("").
* platform/KURL.cpp:
(WebCore::decodeURLEscapeSequences): Abstracted code into template-function decodeEscapeSequences().
This function just calls decodeEscapeSequences<URLEscapeSequence>().
* platform/text/DecodeEscapeSequences.h: Added.
(WebCore::Unicode16BitEscapeSequence::findInString):
(WebCore::Unicode16BitEscapeSequence::matchStringPrefix):
(WebCore::Unicode16BitEscapeSequence::decodeRun):
(WebCore::URLEscapeSequence::findInString):
(WebCore::URLEscapeSequence::matchStringPrefix):
(WebCore::URLEscapeSequence::decodeRun):
(WebCore::decodeEscapeSequences):
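
The split listed above (a policy type providing findInString, matchStringPrefix and decodeRun, driven by one template function) can be sketched roughly as follows. This is a simplified stand-in that uses std::string instead of WTF::String and ignores text encodings; PercentEscape, decodeEscapes and fullyDecode are hypothetical names, and only the standard %XX policy is shown. A %uXXXX policy with a sequence length of 6 would presumably plug into the same template.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// All names below are hypothetical; they only mirror the shape described in the ChangeLog.
static bool isHex(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

static unsigned hexDigit(char c)
{
    assert(isHex(c));
    return c <= '9' ? c - '0' : (c | 0x20) - 'a' + 10;
}

// Policy for standard %XX escapes; each escape decodes to one raw byte.
struct PercentEscape {
    static constexpr std::size_t length = 3;

    static std::size_t find(const std::string& s, std::size_t from)
    {
        return s.find('%', from);
    }

    // Requires pos <= s.size(); the driver below maintains that invariant.
    static bool matchesAt(const std::string& s, std::size_t pos)
    {
        return s.size() - pos >= length && s[pos] == '%' && isHex(s[pos + 1]) && isHex(s[pos + 2]);
    }

    static std::string decodeRun(const std::string& run)
    {
        std::string bytes;
        for (std::size_t i = 0; i < run.size(); i += length)
            bytes += static_cast<char>((hexDigit(run[i + 1]) << 4) | hexDigit(run[i + 2]));
        return bytes;
    }
};

// Generic driver: find each run of consecutive escapes, decode it, copy the rest verbatim.
template<typename EscapeSequence>
std::string decodeEscapes(const std::string& s)
{
    std::string result;
    std::size_t decodedPosition = 0;
    std::size_t searchPosition = 0;
    std::size_t runStart;
    while ((runStart = EscapeSequence::find(s, searchPosition)) != std::string::npos) {
        std::size_t runEnd = runStart;
        while (EscapeSequence::matchesAt(s, runEnd))
            runEnd += EscapeSequence::length;
        searchPosition = runEnd;
        if (runEnd == runStart) {
            ++searchPosition; // Not an escape sequence; skip the marker character.
            continue;
        }
        result.append(s, decodedPosition, runStart - decodedPosition);
        result += EscapeSequence::decodeRun(s.substr(runStart, runEnd - runStart));
        decodedPosition = runEnd;
    }
    result.append(s, decodedPosition, std::string::npos);
    return result;
}

// Repeats decoding until the string stops shrinking, to handle nested encoding.
std::string fullyDecode(std::string s)
{
    std::size_t oldLength;
    do {
        oldLength = s.size();
        s = decodeEscapes<PercentEscape>(s);
    } while (s.size() < oldLength);
    return s;
}
```

In this sketch, fullyDecode("%252525u0061") peels one layer of %25 per pass and stops at "%u0061", which is the point where a %u-aware policy would take over.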

LayoutTests: 

Add tests for decoding non-standard 16-bit Unicode escape sequences.

Also add a test to ensure that we don't cause an assertion failure when
calling window.open("").

* http/tests/security/xssAuditor/resources/echo-intertag-decode-16bit-unicode.pl: Added.
(isUTF16Surrogate):
(decodeRunOf16BitUnicodeEscapeSequences):
(decode16BitUnicodeEscapeSequences):
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode-surrogate-pair-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode-surrogate-pair.html: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode.html: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode2-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode2.html: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode3-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode3.html: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode4-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode4.html: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode5-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode5.html: Added.
* http/tests/security/xssAuditor/script-tag-with-fancy-unicode-expected.txt: Updated expected
result since we now pass this test. We should rename this file to something more descriptive,
see <https://bugs.webkit.org/show_bug.cgi?id=67818>.
* http/tests/security/xssAuditor/script-tag-with-three-times-url-encoded-16bit-unicode-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-three-times-url-encoded-16bit-unicode.html: Added.
* http/tests/security/xssAuditor/window-open-without-url-should-not-assert-expected.txt: Added.
* http/tests/security/xssAuditor/window-open-without-url-should-not-assert.html: Added.


git-svn-id: http://svn.webkit.org/repository/webkit/trunk@94828 268f45cc-cd09-0410-ab3c-d52691b4dbfc
parent 4fc0052d
2011-09-08 Daniel Bates <dbates@webkit.org>
XSS filter bypass via non-standard URL encoding
https://bugs.webkit.org/show_bug.cgi?id=66588
Reviewed by Adam Barth.
Add tests for decoding non-standard 16-bit Unicode escape sequences.
Also add a test to ensure that we don't cause an assertion failure when
calling window.open("").
* http/tests/security/xssAuditor/resources/echo-intertag-decode-16bit-unicode.pl: Added.
(isUTF16Surrogate):
(decodeRunOf16BitUnicodeEscapeSequences):
(decode16BitUnicodeEscapeSequences):
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode-surrogate-pair-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode-surrogate-pair.html: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode.html: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode2-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode2.html: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode3-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode3.html: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode4-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode4.html: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode5-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-16bit-unicode5.html: Added.
* http/tests/security/xssAuditor/script-tag-with-fancy-unicode-expected.txt: Updated expected
result since we now pass this test. We should rename this file to something more descriptive,
see <https://bugs.webkit.org/show_bug.cgi?id=67818>.
* http/tests/security/xssAuditor/script-tag-with-three-times-url-encoded-16bit-unicode-expected.txt: Added.
* http/tests/security/xssAuditor/script-tag-with-three-times-url-encoded-16bit-unicode.html: Added.
* http/tests/security/xssAuditor/window-open-without-url-should-not-assert-expected.txt: Added.
* http/tests/security/xssAuditor/window-open-without-url-should-not-assert.html: Added.
2011-09-08 Fumitoshi Ukai <ukai@chromium.org>
Unreviewed. Chromium rebaseline of css3/bdi-element.html
#!/usr/bin/perl -wT
use strict;
use CGI;
use Encode;
my $cgi = new CGI;
use constant Unicode16BitEscapeSequenceLength => 6; # e.g. %u26C4
my $unicode16BitEscapeSequenceRegEx = qr#%u([0-9A-Fa-f]{4})#; # exactly four hexadecimal digits
sub isUTF16Surrogate($)
{
my ($number) = @_;
return $number >= 0xD800 && $number <= 0xDFFF;
}
# See <http://en.wikipedia.org/wiki/Percent-encoding#Non-standard_implementations>.
sub decodeRunOf16BitUnicodeEscapeSequences($)
{
my ($string) = @_;
my @codeUnits = grep { length($_) } split(/$unicode16BitEscapeSequenceRegEx/, $string);
my $i = 0;
my $decodedRun = "";
while ($i < @codeUnits) {
# FIXME: We fall back to the UTF-8 character if we don't receive a proper high and low surrogate pair.
# Instead, we should add error handling to detect high/low surrogate mismatches and sequences
# of the form low surrogate then high surrogate.
my $hexDigitValueOfPossibleHighSurrogate = hex($codeUnits[$i]);
if (isUTF16Surrogate($hexDigitValueOfPossibleHighSurrogate) && $i + 1 < @codeUnits) {
my $hexDigitValueOfPossibleLowSurrogate = hex($codeUnits[$i + 1]);
if (isUTF16Surrogate($hexDigitValueOfPossibleLowSurrogate)) {
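# Use "v2" (little-endian 16-bit) so the packed bytes match UTF-16LE regardless of host byte order.
$decodedRun .= decode("UTF-16LE", pack("v2", $hexDigitValueOfPossibleHighSurrogate, $hexDigitValueOfPossibleLowSurrogate));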
$i += 2;
next;
}
}
$decodedRun .= chr($hexDigitValueOfPossibleHighSurrogate);
$i += 1;
}
return $decodedRun;
}
sub decode16BitUnicodeEscapeSequences
{
my ($string) = @_;
my $stringLength = length($string);
my $searchPosition = 0;
my $encodedRunPosition = 0;
my $decodedPosition = 0;
my $result = "";
while (($encodedRunPosition = index($string, "%u", $searchPosition)) >= 0) {
my $encodedRunEndPosition = $encodedRunPosition;
while ($stringLength - $encodedRunEndPosition >= Unicode16BitEscapeSequenceLength
&& substr($string, $encodedRunEndPosition, Unicode16BitEscapeSequenceLength) =~ /$unicode16BitEscapeSequenceRegEx/) {
$encodedRunEndPosition += Unicode16BitEscapeSequenceLength;
}
$searchPosition = $encodedRunEndPosition;
if ($encodedRunEndPosition == $encodedRunPosition) {
++$searchPosition;
next;
}
$result .= substr($string, $decodedPosition, $encodedRunPosition - $decodedPosition);
$result .= decodeRunOf16BitUnicodeEscapeSequences(substr($string, $encodedRunPosition, $encodedRunEndPosition - $encodedRunPosition));
$decodedPosition = $encodedRunEndPosition;
}
$result .= substr($string, $decodedPosition);
return $result;
}
print "Content-Type: text/html; charset=UTF-8\n\n";
print "<!DOCTYPE html>\n";
print "<html>\n";
print "<body>\n";
print decode16BitUnicodeEscapeSequences($cgi->param('q'));
print "</body>\n";
print "</html>\n";
CONSOLE MESSAGE: line 1: Refused to execute a JavaScript script. Source code of script found within request.
CONSOLE MESSAGE: line 1: Refused to execute a JavaScript script. Source code of script found within request.
<!DOCTYPE html>
<html>
<head>
<script>
if (window.layoutTestController) {
layoutTestController.dumpAsText();
layoutTestController.setXSSAuditorEnabled(true);
}
</script>
</head>
<body>
<iframe src="http://localhost:8000/security/xssAuditor/resources/echo-intertag.pl?q=<script>alert(/XS%uD834%uDD1E/)</script>">
</iframe>
</body>
</html>
<!DOCTYPE html>
<html>
<head>
<script>
if (window.layoutTestController) {
layoutTestController.dumpAsText();
layoutTestController.setXSSAuditorEnabled(true);
}
</script>
</head>
<body>
<iframe src="http://localhost:8000/security/xssAuditor/resources/echo-intertag-decode-16bit-unicode.pl?q=%25u003c%25u0073%25u0063%25u0072%25u0069%25u0070%25u0074%25u003e%25u0061%25u006c%25u0065%25u0072%25u0074%25u0028%25u002f%25u0058%25u0053%25u0053%25u002f%25u0029%25u003c%25u002f%25u0073%25u0063%25u0072%25u0069%25u0070%25u0074%25u003e">
</iframe>
</body>
</html>
CONSOLE MESSAGE: line 1: Refused to execute a JavaScript script. Source code of script found within request.
<!DOCTYPE html>
<html>
<head>
<script>
if (window.layoutTestController) {
layoutTestController.dumpAsText();
layoutTestController.setXSSAuditorEnabled(true);
}
</script>
</head>
<body>
<iframe src="http://localhost:8000/security/xssAuditor/resources/echo-intertag-decode-16bit-unicode.pl?q=<script>alert(/XS%u002525u0053/)</script>">
</iframe>
</body>
</html>
CONSOLE MESSAGE: line 1: Refused to execute a JavaScript script. Source code of script found within request.
<!DOCTYPE html>
<html>
<head>
<script>
if (window.layoutTestController) {
layoutTestController.dumpAsText();
layoutTestController.setXSSAuditorEnabled(true);
}
</script>
</head>
<body>
<iframe src="http://localhost:8000/security/xssAuditor/resources/echo-intertag-decode-16bit-unicode.pl?q=%25u003c%25u0073%25u0063%25u0072%25u0069%25u0070%25u0074%25u003e%25u0061%25u006c%25u0065%25u0072%25u0074%25u0028%25u002f%25u0058%25u0053%25u0053%25u2620%25u002f%25u0029%25u003c%25u002f%25u0073%25u0063%25u0072%25u0069%25u0070%25u0074%25u003e">
</iframe>
</body>
</html>
CONSOLE MESSAGE: line 1: Refused to execute a JavaScript script. Source code of script found within request.
<!DOCTYPE html>
<html>
<head>
<script>
if (window.layoutTestController) {
layoutTestController.dumpAsText();
layoutTestController.setXSSAuditorEnabled(true);
}
</script>
</head>
<body>
<iframe src="http://localhost:8000/security/xssAuditor/resources/echo-intertag-decode-16bit-unicode.pl?q=<script>alert('%u0058%u0053%u0053%u0020%u05d0%u05d1%u05d8%u05d7%u05d4%u0020%u05e4%u05d2%u05d9%u05e2%u05d5%u05ea-%u8de8%u7ad9%u5f0f%u811a%u672c%u653b%u51fb')</script>">
</iframe>
</body>
</html>
CONSOLE MESSAGE: line 1: Refused to execute a JavaScript script. Source code of script found within request.
<!DOCTYPE html>
<html>
<head>
<script>
if (window.layoutTestController) {
layoutTestController.dumpAsText();
layoutTestController.setXSSAuditorEnabled(true);
}
</script>
</head>
<body>
<iframe src="http://localhost:8000/security/xssAuditor/resources/echo-intertag.pl?q=<script>alert('%u0058%u0053%u0053%u0020%u05d0%u05d1%u05d8%u05d7%u05d4%u0020%u05e4%u05d2%u05d9%u05e2%u05d5%u05ea-%u8de8%u7ad9%u5f0f%u811a%u672c%u653b%u51fb')</script>">
</iframe>
</body>
</html>
ALERT: /XSS/
CONSOLE MESSAGE: line 1: Refused to execute a JavaScript script. Source code of script found within request.
CONSOLE MESSAGE: line 1: Refused to execute a JavaScript script. Source code of script found within request.
<!DOCTYPE html>
<html>
<head>
<script>
if (window.layoutTestController) {
layoutTestController.dumpAsText();
layoutTestController.setXSSAuditorEnabled(true);
}
</script>
</head>
<body>
<iframe src="http://localhost:8000/security/xssAuditor/resources/echo-intertag.pl?q=<script>%252525u0061lert(/XSS/)</script>">
</iframe>
</body>
</html>
This test PASSED if we don't trigger an assertion failure when opening a pop-up window without a URL. To run this test by hand, ensure that pop-up windows aren't blocked before loading this page.
PASSED
<!DOCTYPE html>
<html>
<head>
<script>
if (window.layoutTestController) {
layoutTestController.dumpAsText();
layoutTestController.setXSSAuditorEnabled(true);
layoutTestController.setCanOpenWindows();
layoutTestController.setCloseRemainingWindowsWhenComplete(true);
layoutTestController.waitUntilDone();
}
</script>
</head>
<body>
<p>This test PASSED if we don't trigger an assertion failure when opening a pop-up window without a URL. To run this test by hand, ensure that pop-up windows aren't blocked before loading this page.</p>
<pre id="console"></pre>
<script>
function finish()
{
document.getElementById("console").innerText = "PASSED";
if (window.layoutTestController)
layoutTestController.notifyDone();
}
function runTest()
{
var childWindow = window.open("");
if (!childWindow) {
document.getElementById("console").innerText = "FAILED to open pop-up window. Ensure that pop-up windows aren't blocked.";
return;
}
childWindow.document.open();
childWindow.document.write("PASSED");
<!-- Break up the HTML Script Element so it is not interpreted by HTML4 parsers as per <http://www.w3.org/TR/html4/types.html#type-cdata>. -->
childWindow.document.write("<scr" + "ipt>window.opener.finish()<" + "/script>");
childWindow.document.close();
}
runTest();
</script>
</body>
</html>
2011-09-08 Daniel Bates <dbates@webkit.org>
XSS filter bypass via non-standard URL encoding
https://bugs.webkit.org/show_bug.cgi?id=66588
Reviewed by Adam Barth.
Tests: http/tests/security/xssAuditor/script-tag-with-16bit-unicode-surrogate-pair.html
http/tests/security/xssAuditor/script-tag-with-16bit-unicode.html
http/tests/security/xssAuditor/script-tag-with-16bit-unicode2.html
http/tests/security/xssAuditor/script-tag-with-16bit-unicode3.html
http/tests/security/xssAuditor/script-tag-with-16bit-unicode4.html
http/tests/security/xssAuditor/script-tag-with-16bit-unicode5.html
http/tests/security/xssAuditor/script-tag-with-three-times-url-encoded-16bit-unicode.html
http/tests/security/xssAuditor/window-open-without-url-should-not-assert.html
Implement support for decoding non-standard 16-bit Unicode escape sequences of
the form %u26C4 as described in <http://www.w3.org/International/iri-edit/draft-duerst-iri.html#anchor29>.
See also <http://en.wikipedia.org/wiki/Percent-encoding#Non-standard_implementations>.
* GNUmakefile.list.am: Added DecodeEscapeSequences.h.
* WebCore.gypi: Ditto.
* WebCore.pro: Ditto.
* WebCore.vcproj/WebCore.vcproj: Ditto.
* WebCore.xcodeproj/project.pbxproj: Ditto.
* html/parser/XSSAuditor.cpp:
(WebCore::decode16BitUnicodeEscapeSequences): Added.
(WebCore::decodeStandardURLEscapeSequences): Added.
(WebCore::fullyDecodeString): Modified to call decode16BitUnicodeEscapeSequences().
(WebCore::XSSAuditor::init): Modified to return early when the URL of the document
is the empty string. This can happen when opening a new browser window or calling
window.open("").
* platform/KURL.cpp:
(WebCore::decodeURLEscapeSequences): Abstracted code into template-function decodeEscapeSequences().
This function just calls decodeEscapeSequences<URLEscapeSequence>().
* platform/text/DecodeEscapeSequences.h: Added.
(WebCore::Unicode16BitEscapeSequence::findInString):
(WebCore::Unicode16BitEscapeSequence::matchStringPrefix):
(WebCore::Unicode16BitEscapeSequence::decodeRun):
(WebCore::URLEscapeSequence::findInString):
(WebCore::URLEscapeSequence::matchStringPrefix):
(WebCore::URLEscapeSequence::decodeRun):
(WebCore::decodeEscapeSequences):
2011-09-08 Adam Barth <abarth@webkit.org>
DocumentWriter::deprecatedFrameEncoding doesn't need to refer to Settings
......@@ -2821,6 +2821,7 @@ webcore_sources += \
Source/WebCore/platform/text/BidiContext.h \
Source/WebCore/platform/text/BidiResolver.h \
Source/WebCore/platform/text/BidiRunList.h \
Source/WebCore/platform/text/DecodeEscapeSequences.h \
Source/WebCore/platform/text/Hyphenation.cpp \
Source/WebCore/platform/text/Hyphenation.h \
Source/WebCore/platform/text/LineBreakIteratorPoolICU.h \
......
......@@ -846,6 +846,7 @@
'platform/text/BidiRunList.h',
'platform/text/BidiContext.h',
'platform/text/BidiResolver.h',
'platform/text/DecodeEscapeSequences.h',
'platform/text/LineBreakIteratorPoolICU.h',
'platform/text/LineEnding.h',
'platform/text/PlatformString.h',
......
......@@ -2112,6 +2112,7 @@ HEADERS += \
platform/sql/SQLValue.h \
platform/text/Base64.h \
platform/text/BidiContext.h \
platform/text/DecodeEscapeSequences.h \
platform/text/Hyphenation.h \
platform/text/QuotedPrintable.h \
platform/text/qt/TextCodecQt.h \
......
......@@ -31548,6 +31548,10 @@
RelativePath="..\platform\text\BidiRunList.h"
>
</File>
<File
RelativePath="..\platform\text\DecodeEscapeSequences.h"
>
</File>
<File
RelativePath="..\platform\text\Hyphenation.h"
>
......@@ -5449,6 +5449,7 @@
CE54FD381016D9A6008B44C8 /* ScriptSourceProvider.h in Headers */ = {isa = PBXBuildFile; fileRef = CE54FD371016D9A6008B44C8 /* ScriptSourceProvider.h */; settings = {ATTRIBUTES = (Private, ); }; };
CEA3949C11D45CDA003094CF /* StaticHashSetNodeList.cpp in Sources */ = {isa = PBXBuildFile; fileRef = CEA3949A11D45CDA003094CF /* StaticHashSetNodeList.cpp */; };
CEA3949D11D45CDA003094CF /* StaticHashSetNodeList.h in Headers */ = {isa = PBXBuildFile; fileRef = CEA3949B11D45CDA003094CF /* StaticHashSetNodeList.h */; };
CECCFC3B141973D5002A0AC1 /* DecodeEscapeSequences.h in Headers */ = {isa = PBXBuildFile; fileRef = CECCFC3A141973D5002A0AC1 /* DecodeEscapeSequences.h */; };
CEF418CE1179678C009D112C /* ViewportArguments.cpp in Sources */ = {isa = PBXBuildFile; fileRef = CEF418CC1179678C009D112C /* ViewportArguments.cpp */; };
CEF418CF1179678C009D112C /* ViewportArguments.h in Headers */ = {isa = PBXBuildFile; fileRef = CEF418CD1179678C009D112C /* ViewportArguments.h */; settings = {ATTRIBUTES = (Private, ); }; };
D000EBA211BDAFD400C47726 /* FrameLoaderStateMachine.cpp in Sources */ = {isa = PBXBuildFile; fileRef = D000EBA011BDAFD400C47726 /* FrameLoaderStateMachine.cpp */; };
......@@ -12208,6 +12209,7 @@
CE54FD371016D9A6008B44C8 /* ScriptSourceProvider.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ScriptSourceProvider.h; sourceTree = "<group>"; };
CEA3949A11D45CDA003094CF /* StaticHashSetNodeList.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = StaticHashSetNodeList.cpp; sourceTree = "<group>"; };
CEA3949B11D45CDA003094CF /* StaticHashSetNodeList.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = StaticHashSetNodeList.h; sourceTree = "<group>"; };
CECCFC3A141973D5002A0AC1 /* DecodeEscapeSequences.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = DecodeEscapeSequences.h; sourceTree = "<group>"; };
CEF418CC1179678C009D112C /* ViewportArguments.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = ViewportArguments.cpp; sourceTree = "<group>"; };
CEF418CD1179678C009D112C /* ViewportArguments.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ViewportArguments.h; sourceTree = "<group>"; };
D000EBA011BDAFD400C47726 /* FrameLoaderStateMachine.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = FrameLoaderStateMachine.cpp; sourceTree = "<group>"; };
......@@ -18384,6 +18386,7 @@
B2C3D9F30D006C1D00EF6F26 /* BidiContext.h */,
B2C3D9F40D006C1D00EF6F26 /* BidiResolver.h */,
A8C402921348B2220063F1E5 /* BidiRunList.h */,
CECCFC3A141973D5002A0AC1 /* DecodeEscapeSequences.h */,
375CD231119D43C800A2A859 /* Hyphenation.h */,
A5ABB78613B904BC00F197E3 /* LineBreakIteratorPoolICU.h */,
89B5EA9F11E8003D00F2367E /* LineEnding.cpp */,
......@@ -23476,6 +23479,7 @@
1A927FD21416A15B003A83C8 /* npapi.h in Headers */,
1A927FD31416A15B003A83C8 /* npruntime.h in Headers */,
1A927FD41416A15B003A83C8 /* nptypes.h in Headers */,
CECCFC3B141973D5002A0AC1 /* DecodeEscapeSequences.h in Headers */,
);
runOnlyForDeploymentPostprocessing = 0;
};
/*
* Copyright (C) 2011 Adam Barth. All Rights Reserved.
* Copyright (C) 2011 Daniel Bates (dbates@intudata.com).
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
......@@ -28,6 +29,7 @@
#include "Console.h"
#include "DOMWindow.h"
#include "DecodeEscapeSequences.h"
#include "Document.h"
#include "DocumentLoader.h"
#include "Frame.h"
......@@ -115,20 +117,29 @@ static bool containsJavaScriptURL(const Vector<UChar, 32>& value)
return equalIgnoringCase(value.data() + i, javaScriptScheme, lengthOfJavaScriptScheme);
}
static inline String decode16BitUnicodeEscapeSequences(const String& string)
{
// Note, the encoding is ignored since each %u-escape sequence represents a UTF-16 code unit.
return decodeEscapeSequences<Unicode16BitEscapeSequence>(string, UTF8Encoding());
}
static inline String decodeStandardURLEscapeSequences(const String& string, const TextEncoding& encoding)
{
// We use decodeEscapeSequences() instead of decodeURLEscapeSequences() (declared in KURL.h) to
// avoid platform-specific URL decoding differences (e.g. KURLGoogle).
return decodeEscapeSequences<URLEscapeSequence>(string, encoding);
}
static String fullyDecodeString(const String& string, const TextResourceDecoder* decoder)
{
const TextEncoding& encoding = decoder ? decoder->encoding() : UTF8Encoding();
size_t oldWorkingStringLength;
String workingString = string;
do {
oldWorkingStringLength = workingString.length();
workingString = decodeURLEscapeSequences(workingString);
workingString = decode16BitUnicodeEscapeSequences(decodeStandardURLEscapeSequences(workingString, encoding));
} while (workingString.length() < oldWorkingStringLength);
if (decoder) {
CString workingStringUTF8 = workingString.utf8();
String decodedString = decoder->encoding().decode(workingStringUTF8.data(), workingStringUTF8.length());
if (!decodedString.isEmpty())
workingString = decodedString;
}
ASSERT(!workingString.isEmpty());
workingString.replace('+', ' ');
workingString = canonicalize(workingString);
return workingString;
......@@ -169,6 +180,12 @@ void XSSAuditor::init()
const KURL& url = m_parser->document()->url();
if (url.isEmpty()) {
// The URL can be empty when opening a new browser window or calling window.open("").
m_isEnabled = false;
return;
}
if (url.protocolIsData()) {
m_isEnabled = false;
return;
......
......@@ -26,6 +26,7 @@
#include "config.h"
#include "KURL.h"
#include "DecodeEscapeSequences.h"
#include "TextEncoding.h"
#include <stdio.h>
#include <wtf/HashMap.h>
......@@ -251,14 +252,6 @@ static inline bool isSchemeCharacterMatchIgnoringCase(char character, char schem
return (character | 0x20) == schemeCharacter;
}
static inline int hexDigitValue(UChar c)
{
ASSERT(isASCIIHexDigit(c));
if (c < 'A')
return c - '0';
return (c - 'A' + 10) & 0xF; // handle both upper and lower case without a branch
}
// Copies the source to the destination, assuming all the source characters are
// ASCII. The destination buffer must be large enough. Null characters are allowed
// in the source string, and no attempt is made to null-terminate the result.
......@@ -933,59 +926,14 @@ String KURL::deprecatedString() const
return result.toString();
}
String decodeURLEscapeSequences(const String& str)
String decodeURLEscapeSequences(const String& string)
{
return decodeURLEscapeSequences(str, UTF8Encoding());
return decodeEscapeSequences<URLEscapeSequence>(string, UTF8Encoding());
}
String decodeURLEscapeSequences(const String& str, const TextEncoding& encoding)
String decodeURLEscapeSequences(const String& string, const TextEncoding& encoding)
{
StringBuilder result;
CharBuffer buffer;
unsigned length = str.length();
unsigned decodedPosition = 0;
unsigned searchPosition = 0;
size_t encodedRunPosition;
while ((encodedRunPosition = str.find('%', searchPosition)) != notFound) {
// Find the sequence of %-escape codes.
unsigned encodedRunEnd = encodedRunPosition;
while (length - encodedRunEnd >= 3
&& str[encodedRunEnd] == '%'
&& isASCIIHexDigit(str[encodedRunEnd + 1])
&& isASCIIHexDigit(str[encodedRunEnd + 2]))
encodedRunEnd += 3;
searchPosition = encodedRunEnd;
if (encodedRunEnd == encodedRunPosition) {
++searchPosition;
continue;
}
// Decode the %-escapes into bytes.
unsigned runLength = (encodedRunEnd - encodedRunPosition) / 3;
buffer.resize(runLength);
char* p = buffer.data();
const UChar* q = str.characters() + encodedRunPosition;
for (unsigned i = 0; i < runLength; ++i) {
*p++ = (hexDigitValue(q[1]) << 4) | hexDigitValue(q[2]);
q += 3;
}
// Decode the bytes into Unicode characters.
String decoded = (encoding.isValid() ? encoding : UTF8Encoding()).decode(buffer.data(), p - buffer.data());
if (decoded.isEmpty())
continue;
// Build up the string with what we just skipped and what we just decoded.
result.append(str.characters() + decodedPosition, encodedRunPosition - decodedPosition);
result.append(decoded);
decodedPosition = encodedRunEnd;
}
result.append(str.characters() + decodedPosition, length - decodedPosition);
return result.toString();
return decodeEscapeSequences<URLEscapeSequence>(string, encoding);
}
// Caution: This function does not bounds check.
......
/*
* Copyright (C) 2011 Daniel Bates (dbates@intudata.com). All Rights Reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
* PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR
* CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
* EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
* PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR