Low memory XML decoding (parsing scans iteratively) #23

Open
lomereiter wants to merge 2 commits into ISA-tools:master from alexandrovteam:lowmem

@lomereiter lomereiter commented Mar 1, 2017

This PR provides an alternative solution to #13: each scan is parsed once, all the necessary information is extracted from it, and then the node is freed.
On a large imzML file this brought peak memory consumption down from 2.5 GB to 90 MB, although processing time increased from 6 s to 13 s.
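
For readers who want to see the general pattern, here is a minimal sketch of the parse-extract-free loop described above. It is not the code from this PR: it assumes the standard library's `xml.etree.ElementTree` rather than whatever parser the project uses, and `iter_scans`, `local_name`, and the `SAMPLE` document are hypothetical names made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for an mzML/imzML document.
SAMPLE = b"""<mzML><run><spectrumList count="2">
<spectrum id="scan=1"><cvParam name="ms level" value="1"/></spectrum>
<spectrum id="scan=2"><cvParam name="ms level" value="1"/>
<cvParam name="total ion current" value="5"/></spectrum>
</spectrumList></run></mzML>"""


def local_name(tag):
    """Return the tag name without a '{namespace}' prefix, if it has one."""
    return tag.rsplit("}", 1)[-1]


def iter_scans(source, tag="spectrum"):
    """Yield one small dict per scan, detaching each scan node afterwards."""
    path = []  # stack of currently open elements, used to find the parent
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            path.append(elem)
            continue
        path.pop()  # elem has just been closed
        if local_name(elem.tag) != tag:
            continue
        # Extract everything needed from the scan while it is still in memory.
        record = {
            "id": elem.get("id"),
            "cv_params": {
                child.get("name"): child.get("value")
                for child in elem.iter()
                if local_name(child.tag) == "cvParam"
            },
        }
        yield record
        # Detach the scan from its parent so the subtree can be garbage collected.
        if path:
            path[-1].remove(elem)


if __name__ == "__main__":
    for scan in iter_scans(io.BytesIO(SAMPLE)):
        print(scan["id"], scan["cv_params"])
```

Only one scan subtree is alive at a time, which is where the memory saving would come from. The extra per-event bookkeeping in Python is one plausible reason such an approach runs slower than building the whole tree at once.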

althonos commented Mar 1, 2017

Member

This seems a lot cleaner than what we hacked through at first. I'll review it as soon as I can.

Tomnl commented Mar 1, 2017

Member

Thanks @lomereiter, this is a great contribution.

No unit tests yet, but it seems to be passing the Travis and AppVeyor checks with no problems.

Tomnl commented Apr 3, 2017

Member

Hey @althonos, do you think we should merge this now? Or perhaps we should wait until we have the unit test functionality?

althonos commented Apr 3, 2017 via email

Member

althonos commented Apr 3, 2017

Member

Maybe, because of the increased processing time, we should keep both methods and let the user choose (much as lxml.etree.iterparse lets the caller pass a huge_tree parameter).

Tomnl commented Apr 4, 2017

Member

Yeah, I think you are right @althonos. Keeping both methods seems like the best idea, as memory consumption might not be a problem for some users.
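
One way the "keep both methods" idea could look is a single entry point with an opt-in flag. This is only a sketch under stated assumptions: it uses the standard library's `xml.etree.ElementTree`, and `scan_ids`, its `low_memory` parameter, and the two helper functions are hypothetical names, not part of this project's API.

```python
import io
import xml.etree.ElementTree as ET


def _local(tag):
    """Return the tag name without a '{namespace}' prefix, if it has one."""
    return tag.rsplit("}", 1)[-1]


def _scan_ids_in_memory(source):
    """Faster path: build the whole tree first, then walk it."""
    root = ET.parse(source).getroot()
    return [e.get("id") for e in root.iter() if _local(e.tag) == "spectrum"]


def _scan_ids_streaming(source):
    """Low-memory path: handle each scan as soon as it has been parsed."""
    ids = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if _local(elem.tag) == "spectrum":
            ids.append(elem.get("id"))
            # Drop the scan's children and attributes; the now-empty
            # element itself stays attached to its parent.
            elem.clear()
    return ids


def scan_ids(source, low_memory=False):
    """Return the scan ids, letting the caller trade speed for memory."""
    reader = _scan_ids_streaming if low_memory else _scan_ids_in_memory
    return reader(source)
```

Defaulting `low_memory` to `False` would keep the current speed for users with small files, while those with very large files could opt in to the streaming path.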

3 participants