Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

WIP: Independent child process monitoring #118#134

Merged
fabianp merged 9 commits intopythonprofilers:masterfrom
bbengfort:master
Mar 21, 2017
Merged

WIP: Independent child process monitoring #118#134
fabianp merged 9 commits intopythonprofilers:masterfrom
bbengfort:master

Conversation

@bbengfort
Copy link
Contributor

This pull request is a start at resolving#118 -- as mentioned in that issue, I've created a function_get_child_memory that iterates through all child processes yielding their memory. This is summed if theinclude_children flag is true.

I've also updated the reference from#71 to wrap the entire yield in try/except and yield 0.0 if the race condition occurs; but this needs to be checked.

Including this into the mainmprof library is the next step. I've created a demompmprof which logs child and main process memory separately in th same.dat file, then plots them correctly. Here is an example:

multiprocessing_example

The next steps are to mergempmprof withmprof with some flag, e.g.independent_children or something like that (suggestions welcome) to make it behave likempmprof and to have theplot command correctly handle both cases.

raise NotImplementedError('The psutil module is required when to'
' monitor memory usage of children'
' processes')
raise NotImplementedError((
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Sorry about this, I think this was just a carry over from the historical merge; I can reset back to the original if needed.

mpmprof Outdated
lines += 1

# Collect memory usage of spawned children and write to profile
for idx, mem in enumerate(mp._get_child_memory(proc.pid)):
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

So here is the line where I independently collect the memory of each child and add them to the .dat file with the "CHLD#" key. Ifinclude_children=True then "main" will be the total memory consumed, and if not, it will be only the main process, but all processes will be plotted withindependent_children=True or whatever flag we come up with.

mpmprof Outdated
mpoint = (0, 0)

# Plot all of the series, the main process and the child.
for proc, data in series.items():
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

And then obviously, plot needs to be adapted to plot all the children.

@bbengfort
Copy link
ContributorAuthor

mpmprof uses argparse e.g.#128

@fabianp
Copy link
Collaborator

Thanks@bbengfort! . Do you think it is possible to have only one executable file, i.e., merge mpmprof into mprof?

@bbengfort
Copy link
ContributorAuthor

@fabianp yes, it's possible; just not very straightforward. I'll take a crack at it.

@fabianp
Copy link
Collaborator

fabianp commentedMar 20, 2017 via email

Great!
On 19 Mar 2017, at 11:24, Benjamin Bengfort ***@***.***> wrote:@fabianp <https://github.com/fabianp> yes, it's possible; just not very straightforward. I'll take a crack at it. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <https://github.com/fabianp/memory_profiler/pull/134#issuecomment-287606977>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAQ8h8D_Fjoqwm0monZTeaGq2RMoQTR2ks5rnQJpgaJpZM4MhiXU>.

@bbengfort
Copy link
ContributorAuthor

@fabianp ok, I've pushed a version that does it - would you mind taking a look and running your tests/examples?

@fabianp
Copy link
Collaborator

Great work@bbengfort ! The code looks great, let me try it on some of my own problems. In the meantime, you can add your name to the Authors part of the README :-)

@bbengfort
Copy link
ContributorAuthor

@fabianp thanks! I've updated the README with the example and usage and also added max value markers to the plot for the largest child.

@fabianp
Copy link
Collaborator

If I run the example withmprof run -M multiprocessing_example.py then it fails with an _pickle.PicklingError exception. It works however withmprof run -M python multiprocessing_example.py (not the python after -M). However, both should give the same result as the python mode should be picked up by the .py extension.

the total memory of the program as well as each child individually.

.. warning:: currently the child tracking only works if a ``stream`` is provided to the ``profile`` (e.g. from the command line or in the decorator).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

How hard do you think it would be to remove this limitation?, i.e., to track children even if the function is not@Profile'd

the total memory of the program as well as each child individually.

.. warning:: Currently the child tracking only works if a ``stream`` is provided to the ``profile`` (e.g. from the command line or in the decorator). If you are using the API to retrieve values then the flag will not do anything.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I am confused, as the example works fine even without decorated functions

Copy link
ContributorAuthor

@bbengfortbbengfortMar 20, 2017
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I meant that the stream argument has to be passed in, ifstream = None then I can't return multiple values. I could if you wanted me to return a list instead of a single float. I didn't necessarily mean it wouldn't work with the decorator.

Perhaps I could rephrase as follows:

"Currently tracking individual children requires a reporting output stream; if you'd like direct access to the memory usage of individual children, see the_get_child_memory function."

Though of course, that then points the user to an internal function. I'm open to suggestions.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I've put placeholder comments in the code (line 363 and 403) where this issue occurs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ah, thanks for the explanation. I think the API would be simpler if we allow memory_usage to return a nested list instead embedding this into the stream. What do you think?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ok, I've updatedmemory_usage to return a nested list and updated the README accordingly.

@bbengfort
Copy link
ContributorAuthor

Unfortunately, I think that the pickle error is a product of multiprocessing library: functions can only be pickled if they're defined at the top level and when you run it by creating a subprocess inmprof, the example function is no longer at the top level and therefore cannot be pickled.

I can see if I can change the example (though that may be difficult), but I also wanted to note that the problem exists even with the original version of themprof, so it's not related to the code changes I proposed here. It's likely that any user that uses the multiprocessing library would have to use the python command to ensure their code is serialized/forked correctly.

http://stackoverflow.com/questions/8804830/python-multiprocessing-pickling-error

stream.write("CHLD {0} {1:.6f} {2:.4f}\n".format(idx, chldmem, time.time()))
else:
ret.append(mem_usage)
if multiprocess:
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Here is the first place where I don't return multiple values and instead warn

ret.append(mem_usage)

if multiprocess:
warnings.warn("use include_children not multiprocess without a stream")
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Here is the second place where I don't return multiple values and instead warn

@fabianp
Copy link
Collaborator

For the pickling, what I find strange is that both commands (with and without the "python" word in mprof) should be identical, as per lines 217-223 of mprof. Not sure why multiprocessing fails only on one of these cases since they should be executed the same way.

@bbengfort
Copy link
ContributorAuthor

So I see what you mean, however there is a subtle difference. The args to Popen ifoptions.python = True are:

['python', '-m', 'memory_profiler', '--timestamp', '-o', 'mprofile_20170321080505.dat', 'examples/multiprocessing_example.py']

Whereas the args to Popen if passed as an executable are

`['python', 'examples/multiprocessing_example.py']`

In the first case,__main__ ismemory_profiler.__main__ andexamples/multiprocessing_example.py is run withexecfile() on L1108 orexec() on L1119. This is correctly implemented (as discussedhere) by embedding the exec into its own namespace (ns), thus making the exec'd file the equivalent of an import -- as a result, the functions inside examples/multiprocessing_example.py are not at the top level (but are nested under ns) and therefore cannot be pickled.

So that took me a while to figure out, but I think it makes sense. What do you think?

@fabianpfabianp merged commitd333fda intopythonprofilers:masterMar 21, 2017
@fabianp
Copy link
Collaborator

Thanks for the explanation and your time. I've merged the PR and added a small patch (52c5788) so that the example can be run without the python keyword.

Thanks!

@bbengfort
Copy link
ContributorAuthor

Sure thing, let me know if there is anything else I can help with; sorry that this feature took so long!

@rokroskarrokroskar mentioned this pull requestApr 6, 2017
except psutil.NoSuchProcess:
# https://github.com/fabianp/memory_profiler/issues/71
pass
mem += sum(_get_child_memory(process, meminfo_attr))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Hi, quick question, as processes in modern linux share common memory after forking from their parent and only copy it on writes, wouldn't thissum report the wrong number?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Unfortunately I'm not sure of the behavior of psutil with regards to linux forks; I simply used the same meminfo_attr as the summation function. If that's true then this is indeed a problem, but I'd guess that we'd have to dig into psutil to figure it out.

@fabianp
Copy link
Collaborator

fabianp commentedJun 12, 2017 via email

It would indeed be great to know (but I'm clueless here)On Jun 12, 2017 12:09 PM, "Benjamin Bengfort" <notifications@github.com> wrote:*@bbengfort* commented on this pull request.------------------------------In memory_profiler.py<https://github.com/fabianp/memory_profiler/pull/134#discussion_r121500525>:
@@ -111,12 +143,7 @@ def ps_util_tool():
else 'get_memory_info' mem = getattr(process, meminfo_attr)()[0] / _TWO_20 if include_children:- try:- for p in process.children(recursive=True):- mem += getattr(p, meminfo_attr)()[0] / _TWO_20- except psutil.NoSuchProcess:- #https://github.com/fabianp/memory_profiler/issues/71- pass+ mem += sum(_get_child_memory(process, meminfo_attr))Unfortunately I'm not sure of the behavior of psutil with regards to linuxforks; I simply used the same meminfo_attr as the summation function. Ifthat's true then this is indeed a problem, but I'd guess that we'd have todig into psutil to figure it out.—You are receiving this because you modified the open/close state.Reply to this email directly, view it on GitHub<https://github.com/fabianp/memory_profiler/pull/134#discussion_r121500525>,or mute the thread<https://github.com/notifications/unsubscribe-auth/AAQ8h-Xtff8eOlmZVoMv3dePOSis4Wziks5sDYzxgaJpZM4MhiXU>.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@fabianpfabianpfabianp requested changes

+1 more reviewer

@petroslambpetroslambpetroslamb left review comments

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@bbengfort@fabianp@petroslamb

[8]ページ先頭

©2009-2026 Movatter.jp