Skip to content

bpo-32476 : Add concat function for ElementTree find#5251

Closed
jjolly wants to merge 4 commits intopython:mainfrom
jjolly:etree_xpath_concat
Closed

bpo-32476 : Add concat function for ElementTree find#5251
jjolly wants to merge 4 commits intopython:mainfrom
jjolly:etree_xpath_concat

Conversation

@jjolly
Copy link
Contributor

@jjolly jjolly commented Jan 20, 2018

Problem: Using apostrophes and quotes in a search string is impossible. Currently escaping quotes in the search string is not implemented.
Example:

>>> import xml.etree.ElementTree as et
>>> e = et.Element("root")
>>> c = et.SubElement(e, "branch")
>>> c.set("name", "main")
>>> gc = et.SubElement(c, "leaf")
>>> gc.set("name", "\"quote\" 'apostrophe'")
>>> et.tostring(e)
b'<root><branch name="main"><leaf name="&quot;quote&quot; \'apostrophe\'" /></branch></root>'
>>> r.findall(".//*[@name=\"&quot;quote&quot; 'apostrophe'\"]")
[]

Solution: This patch implements the XPath concat() function within an attribute search.
Result:

>>> r.findall(".//*[@name=concat('\"quote\"', ' ', \"'apostrophe'\")]")
[<Element 'leaf' at 0x7f8f9778cae8>]

https://bugs.python.org/issue32476

@matrixise
Copy link
Member

maybe you could add a

..versionadded:: 3.7
    Support of the XPath concat function

@jjolly
Copy link
Contributor Author

jjolly commented Jan 30, 2018

@matrixise How about full documentation on the new feature - including the versionadd?

@matrixise
Copy link
Member

Fine, you have updated the documentation. Thank you

r"\.\.|"
r"\(\)|"
r"[/.*:\[\]\(\)@=])|"
r"[/.,*:\[\]\(\)@=])|"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a comma here suggests that you should also add error tests for commas appearing in some unusual places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There does not seem to be a precedence for such testing. Is there a similar test for unusually placed '@' or '=' tokens? I could implement a test such as these:

        self.assertRaisesRegex(KeyError, ',', e.find, ',/tag[0]')
        self.assertRaisesRegex(KeyError, ',', e.find, '.,tag[0]')
        self.assertRaisesRegex(KeyError, ',', e.find, './,tag[0]')
        self.assertRaisesRegex(KeyError, ',', e.find, '.,/tag[0]')

If I added these tests, wouldn't it be necessary to add similar tests for the other characters in the regex? Do I test for every character in the regex? How far do I go?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the other characters are part of the general tree structure and thus allowed in more places. All of these places are unlikely to be prepared for handling a ,, so this is a new case. I'd specifically recommend to add tests for things like ,[0], tag[,], tag[(,)] and so on.

I would generally say that covering the parser with more tests would be helpful, but you don't have to do that for this ticket. Since you're at it, though, feel free to add a couple of similar ones for =, : and @. You can do that in a loop, as long as the error message on failures is clear about what incorrect expression was tried.

@jjolly jjolly force-pushed the etree_xpath_concat branch from cb286b0 to 67c0d33 Compare February 19, 2019 04:56
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"closing parenthesis"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"preceding"

@scoder
Copy link
Contributor

scoder commented Mar 4, 2019

One more thing: there should also be tests that it's still possible to look for tags and attributes called concat, i.e. that the parser is limited to special-casingconcat only where it's used as a function.

@jjolly jjolly force-pushed the etree_xpath_concat branch from 67c0d33 to 8128317 Compare May 15, 2019 18:39
@jjolly jjolly force-pushed the etree_xpath_concat branch from 8128317 to 3d714fa Compare May 15, 2019 18:46
@jjolly
Copy link
Contributor Author

jjolly commented May 15, 2019

One more thing: there should also be tests that it's still possible to look for tags and attributes called concat, i.e. that the parser is limited to special-casingconcat only where it's used as a function.

I have added a new tag and attributes in order to test concat in various places within the XML.

@jjolly jjolly force-pushed the etree_xpath_concat branch from 3d714fa to 75dff0a Compare May 15, 2019 18:55
@jjolly
Copy link
Contributor Author

jjolly commented May 15, 2019

...and the spelling changes. Sorry for the multiple pushes.

@jjolly jjolly force-pushed the etree_xpath_concat branch from 75dff0a to 67a1c19 Compare May 15, 2019 20:03
@csabella csabella requested a review from scoder February 7, 2020 12:20
try:
token = next()
except StopIteration:
raise SyntaxError("missing closing parathesis for concat")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise SyntaxError("missing closing parathesis for concat")
raise SyntaxError("missing closing parenthesis for concat")

Comment on lines +2595 to +2601
self.assertEqual(e.find('./tag[@class=concat("d")]').attrib['class'], 'd')
self.assertEqual(e.find('./tag[@class=concat("", \'d\', "")]').attrib['class'], 'd')
self.assertEqual(e.find('./tag[@value="concat"]').attrib['class'], 'b')
self.assertEqual(e.find('./tag[@value=concat("concat")]').attrib['class'], 'b')
self.assertEqual(e.find('./tag[@concat="value"]').attrib['class'], 'd')
self.assertEqual(e.find('./tag[@concat=concat("value")]').attrib['class'], 'd')
self.assertEqual(e.find('./concat[@class=concat("x")]').attrib['class'], 'x')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but … none of these tests represents a real use case for concat(), does it? What about concat("prefix-", @attr) or concat("'", ., "'") ? Those seem to be the really interesting cases to me.

I understand that your intention was to allow for escaping, but a) I can't see that being covered by the tests and b) does that really make a use case for concat() already?

My fear is that we're opening up a feature here which users will bump into and be disappointed because it doesn't actually do what they want. It really just serves the tiny use case of concatenating string constants right now. That's … unexpected at best.

@MaxwellDupre
Copy link
Contributor

Could you amend versionchanged, please? Lets try for 3.11.

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Aug 14, 2022
@scoder
Copy link
Contributor

scoder commented Aug 14, 2022

Closing since the implementation isn't complete while the solution is not targeted to the specific problem. It seems that we'd add more expectations than solutions here.

@scoder scoder closed this Aug 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting review stale Stale PR or inactive for long period of time.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants