Using XPath to search text containing &nbsp;

It seems that OpenQA, the team behind Selenium, has already addressed this problem. They defined variables that explicitly match whitespace. In my case, I need to use an XPath similar to //td[text()="${nbsp}"].

Here is the text from OpenQA on this issue (found here):

HTML automatically normalizes
whitespace within elements, ignoring
leading/trailing spaces and converting
extra spaces, tabs and newlines into a
single space. When Selenium reads text
out of the page, it attempts to
duplicate this behavior, so you can
ignore all the tabs and newlines in
your HTML and do assertions based on
how the text looks in the browser when
rendered. We do this by replacing all
non-visible whitespace (including the
non-breaking space “&nbsp;”) with a
single space. All visible newlines
(<br>, <p>, and <pre> formatted
new lines) should be preserved.
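
To make the normalization described in the quoted paragraph concrete, here is a minimal Python sketch. It is not Selenium's actual code: the function name normalize_whitespace is made up for illustration, and the handling of visible newlines (<br>, <p>, <pre>) is left out.

```python
import re

NBSP = "\u00a0"  # the character that &nbsp; stands for


def normalize_whitespace(text: str) -> str:
    """Approximate the normalization described in the quoted text.

    Hypothetical helper, not part of Selenium: non-breaking spaces become
    ordinary spaces, runs of spaces/tabs/newlines collapse to one space,
    and leading/trailing whitespace is dropped.
    """
    text = text.replace(NBSP, " ")
    text = re.sub(r"[ \t\r\n]+", " ", text)
    return text.strip()


page_text = "  hello" + NBSP + "\n   world\t"
print(repr(normalize_whitespace(page_text)))  # 'hello world'
```

With this kind of normalization applied to both the page text and the test case, an assertion on "hello world" matches even when the HTML source contains tabs, newlines, or &nbsp;.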

We use the same normalization logic on
the text of HTML Selenese test case
tables. This has a number of
advantages. First, you don’t need to
look at the HTML source of the page to
figure out what your assertions should
be; “&nbsp;” symbols are invisible
to the end user, and so you shouldn’t
have to worry about them when writing
Selenese tests. (You don’t need to put
“&nbsp;” markers in your test case
to assertText on a field that contains
“&nbsp;”.) You may also put extra
newlines and spaces in your Selenese
<td> tags; since we use the same
normalization logic on the test case
as we do on the text, we can ensure
that assertions and the extracted text
will match exactly.

This creates a bit of a problem on
those rare occasions when you really
want/need to insert extra whitespace
in your test case. For example, you
may need to type text in a field like
this: “foo   ”. But if you simply
write <td>foo   </td> in your
Selenese test case, we’ll replace your
extra spaces with just one space.

This problem has a simple workaround.
We’ve defined a variable in Selenese,
${space}, whose value is a single
space. You can use ${space} to
insert a space that won’t be
automatically trimmed, like this:
<td>foo${space}${space}${space}</td>.
We’ve also included a variable
${nbsp}, that you can use to insert
a non-breaking space.
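
The effect of ${space} and ${nbsp} can be sketched in Python as follows. This is only an illustration under assumptions: the names read_cell and SELENESE_VARS are hypothetical, and the order of operations (normalize the cell first, then expand variables) is my reading of the quoted text, not something taken from the Selenium source.

```python
import re

# Assumed variable table, based on the two variables named in the quoted text.
SELENESE_VARS = {"space": " ", "nbsp": "\u00a0"}


def normalize_whitespace(text: str) -> str:
    """Collapse whitespace as the quoted text describes (simplified)."""
    text = text.replace("\u00a0", " ")
    return re.sub(r"[ \t\r\n]+", " ", text).strip()


def read_cell(raw: str) -> str:
    """Hypothetical: normalize a <td> cell, then expand ${name} variables."""
    normalized = normalize_whitespace(raw)
    # Unknown variable names are left untouched.
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: SELENESE_VARS.get(m.group(1), m.group(0)),
        normalized,
    )


print(repr(read_cell("foo   ")))                           # 'foo'
print(repr(read_cell("foo${space}${space}${space}")))      # 'foo   '
```

Because the variables are expanded after normalization in this sketch, the spaces they produce survive, which is the behaviour the workaround relies on.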

Note that XPaths do not normalize
whitespace the way we do. If you need
to write an XPath like
//div[text()="hello world"] but the
HTML of the link is really
“hello&nbsp;world”, you’ll need to
insert a real “&nbsp;” into your
Selenese test case to get it to match,
like this:
//div[text()="hello${nbsp}world"].
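
The XPath behaviour is easy to reproduce outside Selenium. The sketch below uses Python's standard xml.etree.ElementTree, which supports only a subset of XPath, so the predicate is written as [.='...'] instead of [text()="..."] (this form needs Python 3.7 or later). The sample markup is made up, and &#160; is used because an XML parser does not know the HTML entity name &nbsp;.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"  # what ${nbsp} expands to

# Made-up sample document; &#160; is the numeric reference for a
# non-breaking space.
doc = ET.fromstring("<table><tr><td>hello&#160;world</td></tr></table>")

# A regular space in the expression does not match the non-breaking space.
with_space = doc.findall(".//td[.='hello world']")

# Putting a real non-breaking space into the expression does match.
with_nbsp = doc.findall(f".//td[.='hello{NBSP}world']")

print(len(with_space))  # 0
print(len(with_nbsp))   # 1
```

This is the same reason //td[text()="${nbsp}"] works in my case: the expression has to contain the actual non-breaking space character, since XPath compares the text exactly.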
