Closed Bug 559814 Opened 14 years ago Closed 14 years ago

Mozmill 1.4.1 fails when non ASCII characters are contained in properties or entities

Categories

(Testing Graveyard :: Mozmill, defect)

defect
Not set
critical

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: adriank, Assigned: harth)

References

Details

(Keywords: regression, Whiteboard: [qae-p1][mozmill-1.4.2+])

Attachments

(3 files, 2 obsolete files)

after upgrading to Mozmill 1.4.1 from 1.4, the testSecurity/testBlueLarry.js test fails on Polish 3.6.3 Firefox build as well as todays Polish 3.6.4pre nightly.
It works on the en-US builds.

After going back to 1.4 it works again, so something must be wrong in 1.4.1.

Output:

Adrian-Kallas-MacBook-Pro-15:mozmill-tests-default mozilla$ mozmill --show-errors -t firefox/testSecurity/testBlueLarry.js -b "/Applications/Firefox-pl.app/"
endRunner was never called. There must have been a failure in the framework.
This is definitely a P1. It blocks running our tests with localized builds. We should get this fixed with the next Mozmill release. Even if we have to push another 1.4 bug fix release.
Severity: normal → critical
Whiteboard: [qae-p1][mozmill-1.4.2]
I have no idea what has been caused this regression. We successfully call goQuitApplication but an exception is thrown here:

http://github.com/mikeal/mozmill/blob/master/mozmill/__init__.py#L279
Keywords: regression
Whiteboard: [qae-p1][mozmill-1.4.2] → [qae-p1][mozmill-1.4.2?]
Whiteboard: [qae-p1][mozmill-1.4.2?] → [qae-p1][mozmill-1.4.2+]
Adrian, have you had time to create a simplified test which shows this behavior?
OS: Mac OS X → All
Hardware: x86 → All
Arian, ping? If you don't have time please mention it, so someone else has to take care about it. Thanks.
I have seen the same problem now when running tests with my add-on test-run script. Trying to execute a test module which references non-existent shared modules seems to be at least one reason. Looks like we break somewhere in the collector and don't run the test function. Firefox gets correctly closed but since there are no results something goes crazy.
Attached file simplified testcase (obsolete) —
output of this test case I'm getting on Fx 3.6.3 en-US and pl:
http://pastebin.mozilla.org/728820
Attached file test case v2 (obsolete) —
"bigger" updates in this test case:
-no more errors on en-US - there also weren't errors on en-US in the original test
-getNode().value replaced by getNode().textContent
Attachment #447811 - Attachment is obsolete: true
it looks like this bug has to do something with non-ASCII characters: on locales that use for the test-string just ASCII characters, the test passes. 

Locales that passes the test found so far:
eu, en-GB, en-US
Thanks Adrian for the investigation. This bug is somehow related to bug 506760 but I'm not really sure why it has worked with 1.4 though. We already were able to compare non ASCII strings via assertProperty. No idea why my patch on bug 558404 has been caused this problem.
Summary: Mozmill 1.4.1 fails on testSecurity/testBlueLarry.js on localized builds → Mozmill 1.4.1 fails when non ASCII characters are contained in properties or entities
Assignee: nobody → harthur
Status: NEW → ASSIGNED
I can only get this to fail from the command line, so its hard to debug. Is there any way to get harness exceptions to be reported to the command line?
Wow, that's interesting. Adrian can you see the same on your box?

I would say the logging module of Mozmill could help to get information outputted to the command line. Otherwise dump to the console and check your console output on OS X.
(In reply to comment #12)
> Wow, that's interesting. Adrian can you see the same on your box?

yes, I can. Works perfectly when run from the Mozmill IDE... A strang thing, that reminds me of some similar problem I had some time ago with Silme and L10n-Checks (also Python):
Before sending any string to the console, I had to do: string.encode('utf_8')
Do we have a stack from the python end?
CTraceback (most recent call last):
  File "/usr/local/bin/mozmill", line 8, in <module>
    load_entry_point('mozmill==1.4.1', 'console_scripts', 'mozmill')()
  File "/home/heather/mozmill/mozmill/__init__.py", line 580, in cli
    CLI().run()
  File "/home/heather/mozmill/mozmill/__init__.py", line 557, in run
    self._run()
  File "/home/heather/mozmill/mozmill/__init__.py", line 536, in _run
    self.mozmill.stop()
  File "/home/heather/mozmill/mozmill/__init__.py", line 291, in stop
    sleep(1);
KeyboardInterrupt


It's just waiting for endListener, so my guess is that mozmill is throwing on the js side and never sends that event, I'm going to start digging through the js code.
(In reply to comment #15)
> CTraceback (most recent call last):
>   File "/usr/local/bin/mozmill", line 8, in <module>
>     load_entry_point('mozmill==1.4.1', 'console_scripts', 'mozmill')()
>   File "/home/heather/mozmill/mozmill/__init__.py", line 580, in cli
>     CLI().run()
>   File "/home/heather/mozmill/mozmill/__init__.py", line 557, in run
>     self._run()
>   File "/home/heather/mozmill/mozmill/__init__.py", line 536, in _run
>     self.mozmill.stop()
>   File "/home/heather/mozmill/mozmill/__init__.py", line 291, in stop
>     sleep(1);
> KeyboardInterrupt
> 
> 
> It's just waiting for endListener, so my guess is that mozmill is throwing on
> the js side and never sends that event, I'm going to start digging through the
> js code.

Does this mean that we don't get JS exceptions in mozmill (well, jsbridge)?
What's the end-point on the extension side where we would have to attach the debugger? Do you have a specific function for me, Heather?
I ran a debug build and found these seemly related assertions in the dump at the end:

###!!! ASSERTION: U+0080/U+0100 - U+FFFF data lost: 'legalRange', file /home/heather/Desktop/ffx3.6.x_dbg/mozilla-1.9.2/js/src/xpconnect/src/xpcconvert.cpp, line 794
###!!! ASSERTION: U+0080/U+0100 - U+FFFF data lost: 'legalRange', file /home/heather/Desktop/ffx3.6.x_dbg/mozilla-1.9.2/js/src/xpconnect/src/xpcconvert.cpp, line 794

bit length overflow
code 2 bits 6->7
code 6 bits 6->7
WARNING: An event was posted to a thread that will never run it (rejected): file /home/heather/Desktop/ffx3.6.x_dbg/mozilla-1.9.2/xpcom/threads/nsThread.cpp, line 362
WARNING: unable to post continuation event: file /home/heather/Desktop/ffx3.6.x_dbg/mozilla-1.9.2/xpcom/io/nsStreamUtils.cpp, line 475
WARNING: nsExceptionService ignoring thread destruction after shutdown: file /home/heather/Desktop/ffx3.6.x_dbg/mozilla-1.9.2/xpcom/base/nsExceptionService.cpp, line 194
nsXPConnect::CommenceShutdown()


Given that it's talking about converting and streams, I'm thinking that jsbridge (on the extension end) can't send the non-ascii characters over the stream, which would also explain why we're seeing this only with the command line and we saw it when we changed what we sent across jsbridge.

So, we can see if we can convert everything we send over jsbridge to ascii somehow?
Clint and I tracked it down to jsbridge. It wasn't converting to UTF-8 before sending, fix is in my encoding branch of jsbridge:

http://github.com/harthur/jsbridge/compare/encoding
Ok, so check comment 9, which I have missed here. Bug 558404 is really the cause of this problem because running an assertProperty will now include the objects value in the pass/fail string. In case of localized builds we fail. Lemme create a small simplified test which will also show this problem in en-US builds.
Blocks: 558404
Attached file Simplified test
This test is a really stripped-down version and can also be used with en-US builds. As you can see we have problems in both directions, whereby Heather was already working on a patch for the python side on bug 506760. I will check if that patch helps us already to fix one side of the problem.
Attachment #447834 - Attachment is obsolete: true
Ok, bug 506760 will not help at all for this situation, because it only acts on the content from the editor. Here we have:

* UTF-8 characters in test files are not send correctly to the browser
* UTF-8 characters retrieved from elements while the test runs are not correctly send to the Python back-end
(In reply to comment #19)
> Clint and I tracked it down to jsbridge. It wasn't converting to UTF-8 before
> sending, fix is in my encoding branch of jsbridge:
> 
> http://github.com/harthur/jsbridge/compare/encoding

I have overseen this comment. So I have good news and bad news. Which ones you wanna hear first? Ok, good news: The sending of those characters works fine now from the browser to the Python side. But there are still two open issues:

1. The string has to be decoded on the Python side so we do not show something like 'Strony powi\u0105zane:' in the test result.

2. Trying to access elements with utf8 characters in their name/id/... still doesn't work. You can still use my simplified testcase.
(In reply to comment #19)
> Clint and I tracked it down to jsbridge. It wasn't converting to UTF-8 before
> sending, fix is in my encoding branch of jsbridge:
> 
> http://github.com/harthur/jsbridge/compare/encoding

my original testcase and the testBlueLarry.js are working now with this patch and the patch from Bug 506760#c15 applied :)

Henrik's test case isn't working...
(In reply to comment #22)
> Ok, bug 506760 will not help at all for this situation, because it only acts on
> the content from the editor.

no, mozISubScriptLoader is used for command-line tests as well.
Unfortunately I don't know much about encoding, but I think the problem is in the mozISubscriptLoader. I was converting from ISO-8859-1 to unicode in the fix for bug 506760 http://github.com/harthur/mozmill/compare/master...encoding#L1R353, but I don't think that works for Polish? Maybe we have to guess which encoding they need each time we run a test, but I have no idea.
Jeff has enlightened me. If a test contains non UTF-8 characters, it will have to explicitly call out which encoding it is using.
I am not all that familiar with how JS deals with encodings (or any other language, really) so if this is useless information forgive me.  I modified attachment 453328 [details] to alert the characters:

diff testTest.js <(curl https://bug559814.bugzilla.mozilla.org/attachment.cgi?id=453328 2> /dev/null)
7,9c7
<   var link_title = "Strona główna";
<   var link = new elementslib.Link(controller.tabs.activeTab, link_title);
<   controller.window.alert(link_title);
---
>   var link = new elementslib.Link(controller.tabs.activeTab, "Strona główna");

The alert does not nicely render link_title, but instead puts some garbage.  I'll attach a screenshot and figure out more later.
(In reply to comment #29)
> Created attachment 458479 [details]
> screenshot of JS alert with the text from the applied patch

Having an HTML page with JS produces the same results.  Not sure what I expected:

<html>
<head>
<script type="text/javascript">
function foo() {
  var link_title = "Strona główna";
  alert(link_title);
}
window.onload = foo;
</script>
<head>
<body>
</body>
</html>
It could be we're not getting proper escapes the way we load files: http://tech.nigel.in/2008_02_01_archive.html

I'll see if the workaround works tomorrow.
the problem I ran into with eval() is that the second arg (the context) is deprecated now )-: Maybe there's a workaround for that though. It would be useful to check in and see what the Jetpack module loader is doing these days. I think they were using evalInSandbox and somehow managed to inject things like Components into the scope.
Why and where do we have to use eval?
To load in a test file, right now what we are using loadSubScript for.
(In reply to comment #26)
> Unfortunately I don't know much about encoding, but I think the problem is in
> the mozISubscriptLoader. I was converting from ISO-8859-1 to unicode in the fix
> for bug 506760
> http://github.com/harthur/mozmill/compare/master...encoding#L1R353, but I don't
> think that works for Polish? Maybe we have to guess which encoding they need
> each time we run a test, but I have no idea.

My current guess is that loadSubScript does not do the right thing for file:/// URIs and unicode.
We should really assess the scope of this bug.  There are a few possible things to be done:

 * change the way we load tests.  There are several different ways of loading JS in mozilla-central.  I'm currently investigating them and seeing what works.  loadSubScript may not work with file:// URIs....I don't know, I haven't been able to get an answer on that yet (FWIW, I've changed the default accept encoding to UTF-8 and that doesn't help the issue).

 * change the way we write tests.  This will certainly work if we verbosely escape unicode.  It makes it worse for test writers, which is a big minus.  OTOH, it requires no change in code.
(In reply to comment #36)
> We should really assess the scope of this bug.  There are a few possible things
> to be done:
> 
>  * change the way we load tests.  There are several different ways of loading
> JS in mozilla-central.  I'm currently investigating them and seeing what works.
>  loadSubScript may not work with file:// URIs....I don't know, I haven't been
> able to get an answer on that yet (FWIW, I've changed the default accept
> encoding to UTF-8 and that doesn't help the issue).
> 
>  * change the way we write tests.  This will certainly work if we verbosely
> escape unicode.  It makes it worse for test writers, which is a big minus. 
> OTOH, it requires no change in code.

Or to put this in a more actionable way "How much work is it worth to have tests which are unicode (UTF-8)?"
(In reply to comment #36)
>  loadSubScript may not work with file:// URIs....I don't know, I haven't been
> able to get an answer on that yet (FWIW, I've changed the default accept
> encoding to UTF-8 and that doesn't help the issue).

See also bug 377498 which eventually could block us here too when using loadSubScript.
Depends on: 377498
(In reply to comment #38)
> (In reply to comment #36)
> >  loadSubScript may not work with file:// URIs....I don't know, I haven't been
> > able to get an answer on that yet (FWIW, I've changed the default accept
> > encoding to UTF-8 and that doesn't help the issue).
> 
> See also bug 377498 which eventually could block us here too when using
> loadSubScript.

Thanks for finding that bug -- that's exactly what we're running up against.  So as far as fixing the tests, there are a few options:

 * test files can be written in/converted to iso-8859-1 which is what loadSubScript will assume they are anyway when given a file:// URI

 * test files can use unicode escapes which will always work

As far as fixing the actual issue, there are a few options.  I'll list them here in possible order of expediency:

 * Components.utils.import works with file:// URIs and (in theory) should assume utf-8 files by default.  BUT it doesn't work in quite the same way.  Specifically the issue I run up against is that after changing the code from `loader.loadSubScript(uri, module)` to `Components.utils.import(uri, module)`, I get `ERROR | Test Failure: {"exception": {"message": "mozmill is not defined" ...` regardless of whether I do `module.mozill = mozmill` before or after the import.  In addition, all tests will need to add `EXPORTED_SYMBOLS` to each test file.  Since this is a major API change, it should not be done for 1.4.2 . If anyone is better versed in the above issue with `import`, I'd love to know if it was possible.  I could not find any code that attempted to do this.

 * we could serve the files or magically wrap them in HTML (with a <meta> tag) to explicitly set the Content-type: text/javascript; charset=utf-8 ; its an ugly hack, but it would be quick.

 * a mozIJSSubScriptLoader2 could be written, as per bug 377498, either as part of Mozmill and/or as part of mozilla-central (probably both).  This would accept the charset to be loaded.

 * we could move the tests to using common.js.  This should take care of this problem, but again will break the API horribly.

 * we could read the contents of the file, verbosely specifying the encoding, and then use eval or evalInSandbox to run the tests.  This sounds "nice", but it would require an entirely different structure for the mozmill extension code. The test files would need to be eval-ed at test-running time, probably futzing with the string to get an appropriate runner-harness (etc) in them.  So boo!

Conceivably, the quick+dirty serving them via http could work for 1.4.2.  That is not necessarily a recommendation.
(In reply to comment #39)
>  * Components.utils.import works with file:// URIs and (in theory) should
> assume utf-8 files by default.  BUT it doesn't work in quite the same way. 
> Specifically the issue I run up against is that after changing the code from
> `loader.loadSubScript(uri, module)` to `Components.utils.import(uri, module)`,
> I get `ERROR | Test Failure: {"exception": {"message": "mozmill is not defined"

I tried similar things in the past, and also got a lot of issues with import. IMO we shouldn't spend time on that when we already plan to use commonJS and would have to refactor everything.

> If anyone is better versed in the above issue with `import`, I'd love to know
> if it was possible.  I could not find any code that attempted to do this.

say utils.js: Everything should be under the utils namespace and we only have to export the namespace instead of all the functions. That way we could directly import stuff in other js files with:

Components.utils.import('resource://mozmill/modules/utils.js');

instead of:

var utils = {}; Components.utils.import('resource://mozmill/modules/utils.js', utils);

If we do this cleanly it should be even safe for bugfix releases, because the scope stays the same.

>  * a mozIJSSubScriptLoader2 could be written, as per bug 377498, either as part
> of Mozmill and/or as part of mozilla-central (probably both).  This would
> accept the charset to be loaded.

Well, there are discussion to move those interfaces2 into the real interface for Firefox 4.0 at least. For older branches we would have to go this way.
(In reply to comment #40)
> (In reply to comment #39)
> >  * Components.utils.import works with file:// URIs and (in theory) should
> > assume utf-8 files by default.  BUT it doesn't work in quite the same way. 
> > Specifically the issue I run up against is that after changing the code from
> > `loader.loadSubScript(uri, module)` to `Components.utils.import(uri, module)`,
> > I get `ERROR | Test Failure: {"exception": {"message": "mozmill is not defined"
> 
> I tried similar things in the past, and also got a lot of issues with import.
> IMO we shouldn't spend time on that when we already plan to use commonJS and
> would have to refactor everything.

Sounds like a plan.  Should we move this bug to 2.0 whiteboard?  Should I upload a version of the above test that does work with our current setup? Instead of moving this ticket, we could alternatively write a new ticket, as the existing ticket is a conflation of two issues:

 * the ability to send unicode of jsbridge -- harth already fixed this
 * the ability to use utf-8 native test files, which isn't possible with our current setup (re bug 377498) but should be possible with common.js 

(I'll also note in passing that I, at least personally, don't have any desire or think its a good idea to support encodings in test files besides unicode.  While I understand the reverse argument, I imagine that it will be quite a project to support arbitrary encodings.  Maybe I'm wrong).

So what should be done next here?

> > If anyone is better versed in the above issue with `import`, I'd love to know
> > if it was possible.  I could not find any code that attempted to do this.
> 
> say utils.js: Everything should be under the utils namespace and we only have
> to export the namespace instead of all the functions. That way we could
> directly import stuff in other js files with:
> 
> Components.utils.import('resource://mozmill/modules/utils.js');
> 
> instead of:
> 
> var utils = {}; Components.utils.import('resource://mozmill/modules/utils.js',
> utils);
> 
> If we do this cleanly it should be even safe for bugfix releases, because the
> scope stays the same.
> 
> >  * a mozIJSSubScriptLoader2 could be written, as per bug 377498, either as part
> > of Mozmill and/or as part of mozilla-central (probably both).  This would
> > accept the charset to be loaded.
> 
> Well, there are discussion to move those interfaces2 into the real interface
> for Firefox 4.0 at least. For older branches we would have to go this way.

Not sure I understand this comment.  the component is not yet written (either in mozmill or m-c).  What is meant here?
(In reply to comment #41)
>  * the ability to use utf-8 native test files, which isn't possible with our
> current setup (re bug 377498) but should be possible with common.js 

I'm not that familar with all that encoding stuff but all of our tests have UTF-8 encoding. It only fails for characters which are outside of the ASCII scope.

> (I'll also note in passing that I, at least personally, don't have any desire
> or think its a good idea to support encodings in test files besides unicode. 
> While I understand the reverse argument, I imagine that it will be quite a
> project to support arbitrary encodings.  Maybe I'm wrong).

I agree. We should only work with UTF-8 encoding.

> So what should be done next here?

Get harthurs patch reviewed and checked in. Then we can see what's working and what not. That's fair?

> > >  * a mozIJSSubScriptLoader2 could be written, as per bug 377498, either as part
> > > of Mozmill and/or as part of mozilla-central (probably both).  This would
> > > accept the charset to be loaded.
> > 
> > Well, there are discussion to move those interfaces2 into the real interface
> > for Firefox 4.0 at least. For older branches we would have to go this way.
> 
> Not sure I understand this comment.  the component is not yet written (either
> in mozmill or m-c).  What is meant here?

On trunk we have broken a couple of frozen interfaces. That's why we have bumped the Gecko version to 2.0. An idea is to move all the interfaces2 into interfaces now. So if we would like to have that functionality it would be the best time now. But for older branches we would still need the mozIJSSubScriptLoader2 interface and it would have to be approved for landing.
(In reply to comment #42)
> (In reply to comment #41)
> >  * the ability to use utf-8 native test files, which isn't possible with our
> > current setup (re bug 377498) but should be possible with common.js 
> 
> I'm not that familar with all that encoding stuff but all of our tests have
> UTF-8 encoding. It only fails for characters which are outside of the ASCII
> scope.

Sure, but this is a choice of what we support.
 
> > (I'll also note in passing that I, at least personally, don't have any desire
> > or think its a good idea to support encodings in test files besides unicode. 
> > While I understand the reverse argument, I imagine that it will be quite a
> > project to support arbitrary encodings.  Maybe I'm wrong).
> 
> I agree. We should only work with UTF-8 encoding.
> 
> > So what should be done next here?
> 
> Get harthurs patch reviewed and checked in. Then we can see what's working and
> what not. That's fair?

Sure.  It won't fix this bug though because of bug 377498.  The failure is at the JS loader level.

> > > >  * a mozIJSSubScriptLoader2 could be written, as per bug 377498, either as part
> > > > of Mozmill and/or as part of mozilla-central (probably both).  This would
> > > > accept the charset to be loaded.
> > > 
> > > Well, there are discussion to move those interfaces2 into the real interface
> > > for Firefox 4.0 at least. For older branches we would have to go this way.
> > 
> > Not sure I understand this comment.  the component is not yet written (either
> > in mozmill or m-c).  What is meant here?
> 
> On trunk we have broken a couple of frozen interfaces. That's why we have
> bumped the Gecko version to 2.0. An idea is to move all the interfaces2 into
> interfaces now. So if we would like to have that functionality it would be the
> best time now. But for older branches we would still need the
> mozIJSSubScriptLoader2 interface and it would have to be approved for landing.

I guess I don't understand the flow.  mozIJSSubscriptLoader2 does not yet exist.  So how can mozIJSSubscriptLoader be deprecated for 2.0?
(In reply to comment #43)
> > I'm not that familar with all that encoding stuff but all of our tests have
> > UTF-8 encoding. It only fails for characters which are outside of the ASCII
> > scope.
> 
> Sure, but this is a choice of what we support.

Sure, but as long as none of those non-ASCII characters are in our files, we don't have a problem.

> I guess I don't understand the flow.  mozIJSSubscriptLoader2 does not yet
> exist.  So how can mozIJSSubscriptLoader be deprecated for 2.0?

I don't talk about this interface but in general. Any *2 interface is an addition of the base interface. That's necessary for frozen interfaces, because we can't change those. Now with the major dump any *2 interface could be merged with the base interface. That said, we wouldn't eventually have to introduce a new mozIJSSubscriptLoader2 interface. That's all. Any other questions we can solve separately. :)
This seems to be work-aroundable by using escaped unicode characters, as illustrated in this attachment (a modification of the previous attachment)
With the fix on bug 506760 and with Adrian's changes to the test system in bug 559836, we can pass tests that use utf-8 characters.  However, due to the old subscript loader bug, 377498, you will have to escape unicode characters used directly in Mozmill tests until we move to a different script loading paradigm in Mozmill 2.0.

Therefore, due to the fixes in place and due to the workaround and the inability of us to fix 377498 on all branches we are concerned about, I am marking this bug as WONTFIX since I believe that we have done all we can reasonably do.  We will open a new bug for the new script loading paradigm when we start that work for Mozmill 2.0.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX
Well, this bug is not about the encoding of tests but about the transfer of UTF-8 characters via jsbridge, which get thrown when you run assertProperty against properties or entities.

Let's update the bugs to lower the confusion. Bug 506760 is the one which needs to be wontfix'ed.

Landed:
* master:
http://github.com/mozautomation/mozmill/commit/5fed21f2bf5dfccda675ed90442a3811511de366
* 1.4.2:
http://github.com/mozautomation/mozmill/commit/16475539c55aff658f7eed6990067a3c4f661c56
Resolution: WONTFIX → FIXED
Wow, that's pretty awesome. I tried it with testBlueLarry.js by modifying the property we check against and what a surprise everything worked very well. Even the report up on brasstacks in the mozmill-test database for the new dashboard shows the correct result:

http://brasstacks.mozilla.com/couchdb/mozmill-test/_design/dashboard/_show/report/4033d53a16dc6a7072fa65a9d00d633f

Lets call it verified fixed!!
Status: RESOLVED → VERIFIED
Product: Testing → Testing Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: