Discussion:
known limit to size of expressions?
(too old to reply)
Kevin Hunter
2012-02-13 05:48:56 UTC
Permalink
Hullo List,

I have a fairly small expression (~500 variables). When I create a
Relation with it, Python informs me that Sympy has exceeded the max
recursion depth. Is there an intentional limit to expression sizes in
Sympy?

Please find attached a mini-script that highlights the issue I'm
encountering. You should be able to just run it:

$ python recreate_sympy_recursion_issue.py

Thanks,

Kevin
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sympy/-/YPHoqiYIUuYJ.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Aaron Meurer
2012-02-13 06:26:10 UTC
Permalink
No, this is not intentional. Python sets the recursion limit to 1000,
to prevent infinit recursion from causing a stack overflow (in C).
Since many of our functions are highly recursive, we often run into
this limit.

You can raise it with sys.setrecursionlimit. I set it to 10000 in
your script, and it worked. The highest possible limit is platform
dependent, and raising it too high could cause some performance issues
(I'm not sure about this, though, you'll have to play with it).

What we need to do is rewrite as many recursive operations as we can
to be non-recursive. Unfortunately, due to the recursive nature of
SymPy objects (nested objects in .args), some operatios lend
themselves to this very nicely.

In this particular case, it's determining if an Add is positive by
peeling off one term at a time (see line 440 of add.py), which I think
should be changed (not only is this needlessly recursive, but it's
also inefficient, because it recomputes Add.flatten each time, making
it a O(n**2) operation).

If you have an expression that isn't highly nested and you run into
this problem, it's likely that whatever algorithm causes it can be
rewritten to be non-recursive. The real problem comes when your
expression really is highly nested. For example, you'll have
difficulty dealing with the following expression with the default
recursion depth:

x = Symbol('x')
a
for i in xrange(1000):
x = x*(1 - a)

Aaron Meurer
Post by Kevin Hunter
Hullo List,
I have a fairly small expression (~500 variables).  When I create a Relation
with it, Python informs me that Sympy has exceeded the max recursion depth.
 Is there an intentional limit to expression sizes in Sympy?
Please find attached a mini-script that highlights the issue I'm
$ python recreate_sympy_recursion_issue.py
Thanks,
Kevin
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To view this discussion on the web visit
https://groups.google.com/d/msg/sympy/-/YPHoqiYIUuYJ.
To unsubscribe from this group, send email to
For more options, visit this group at
http://groups.google.com/group/sympy?hl=en.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Kevin Hunter
2012-02-13 06:37:11 UTC
Permalink
:-( Bummer. In this context, I regularly work with expressions with
thousands (sometimes tens of thousands) of variables. They aren't highly
nested, with most terms having less than 5 levels, but there are lots of
them.

Thanks for letting me know.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sympy/-/xOXxLp9oOasJ.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Aaron Meurer
2012-02-13 07:53:16 UTC
Permalink
This particular bug should be fixable (see my previous emails). If
this function is rewritten to be non-recursive, it will be one
function call, no matter how big the expression is. As long as you
aren't using the polys, tons of variables shouldn't be a big problem.

Aaron Meurer
:-(  Bummer.  In this context, I regularly work with expressions with
thousands (sometimes tens of thousands) of variables.  They aren't highly
nested, with most terms having less than 5 levels, but there are lots of
them.
Thanks for letting me know.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To view this discussion on the web visit
https://groups.google.com/d/msg/sympy/-/xOXxLp9oOasJ.
To unsubscribe from this group, send email to
For more options, visit this group at
http://groups.google.com/group/sympy?hl=en.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Chris Smith
2012-02-15 21:55:34 UTC
Permalink
:-(  Bummer.  In this context, I regularly work with expressions with
thousands (sometimes tens of thousands) of variables.  They aren't highly
nested, with most terms having less than 5 levels, but there are lots of
them.
10,000 terms now takes just over a second to process with the
modifications I made in

https://github.com/sympy/sympy/pull/1055

Perhaps you could try your expression there.

/c
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Kevin Hunter
2012-02-15 22:25:14 UTC
Permalink
Hmm, have there been API changes recently? With your branch, I'm not able
to check an Equality expression for the attribute is_Relational:

*>>> from sympy import **
*>>> from sympy.abc import x, y*
*>>> Eq(x, y).is_Relational*
*Traceback (most recent call last):*
* File "<stdin>", line 1, in <module>*
*AttributeError: 'Equality' object has no attribute 'is_Relational'*
*>>> *
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sympy/-/52cymGHgmN4J.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Ronan Lamy
2012-02-15 23:04:43 UTC
Permalink
Post by Kevin Hunter
Hmm, have there been API changes recently? With your branch, I'm not
Post by Kevin Hunter
from sympy import *
from sympy.abc import x, y
Eq(x, y).is_Relational
File "<stdin>", line 1, in <module>
AttributeError: 'Equality' object has no attribute 'is_Relational'
Bah, you'd better off avoiding the is_Class attributes anyway, they're
just confusing. Use "isinstance(Eq(x, y), Relational)" instead.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Aaron Meurer
2012-02-15 23:18:51 UTC
Permalink
Even so, it works for me, both in master and in Chris's branch. So
either it was a problem and he fixed it, or you're doing something
wrong. Make sure you checkout is clean (git status), refetch his
branch, and try again.

Aaron Meurer
Post by Ronan Lamy
Hmm, have there been API changes recently?  With your branch, I'm not
Post by Kevin Hunter
from sympy import *
from sympy.abc import x, y
Eq(x, y).is_Relational
  File "<stdin>", line 1, in <module>
AttributeError: 'Equality' object has no attribute 'is_Relational'
Bah, you'd better off avoiding the is_Class attributes anyway, they're
just confusing. Use "isinstance(Eq(x, y), Relational)" instead.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Kevin Hunter
2012-02-15 23:48:15 UTC
Permalink
Bah, thank you. I had checked out Chris' repository, but had forgotten to
change branches.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sympy/-/C-_1H8aPEDUJ.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Kevin Hunter
2012-02-15 23:55:39 UTC
Permalink
Can you expand on why? To me the former reads better, and elides the need
to specifically import the Relational class:

*>>> x.is_Relational*
*True*

vs

*>>> from sympy.core.relational import Relational*
*>>> isinstance(x, Relational)' reads better to me than 'isinstance(x,
Relational)*

Further, the former is faster:

*In [11]: %timeit a.is_Relational*
*10000000 loops, best of 3: 76.3 ns per loop*
*
*
*In [12]: %timeit isinstance(a, Relational)*
*1000000 loops, best of 3: 289 ns per loop*
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sympy/-/JCRlfe33S_YJ.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Ronan Lamy
2012-02-16 01:35:23 UTC
Permalink
Post by Kevin Hunter
Can you expand on why? To me the former reads better, and elides the
"There should be one-- and preferably only one --obvious way to do it."

.is_Relational is an ad-hoc reimplementation of isinstance(), with
Post by Kevin Hunter
Relational.is_Relational
True
Post by Kevin Hunter
Eq(x, y).is_Equality
True
Post by Kevin Hunter
Ne(x, y).is_Unequality
Traceback (most recent call last):
File "<ipython-input-21-c44de31e37c4>", line 1, in <module>
Ne(x, y).is_Unequality
AttributeError: 'Unequality' object has no attribute 'is_Unequality'

OTOH, isinstance() does what it says and its meaning is clear to any
Python programmer - there is no need to know beforehand that in sympy
is_SomeClass means isinstance(..., SomeClass) nor that is_integer is
completely different from is_Integer.
Post by Kevin Hunter
Post by Kevin Hunter
x.is_Relational
True
vs
Post by Kevin Hunter
from sympy.core.relational import Relational
isinstance(x, Relational)' reads better to me than 'isinstance(x,
Relational)
It's actually a good thing that isinstance() is a bit cumbersome to
write. No matter how you spell it, such code introduces a tight coupling
with the class tested (therefore an explicit import is better than an
implicit coupling), and is hard to extend or override. isinstance()
calls also have a tendency to cluster in long elif chains, which devolve
quickly into spaghetti code. See, for instance,
sympy.core.function.count_ops().
Post by Kevin Hunter
In [11]: %timeit a.is_Relational
10000000 loops, best of 3: 76.3 ns per loop
In [12]: %timeit isinstance(a, Relational)
1000000 loops, best of 3: 289 ns per loop
That kind of difference is usually completely negligible compared to
algorithmic improvements.

One last thing: all the is_Foo attributes live in the same namespace, so
if you want to have 2 classes with the same name, you can't (BTW, the
same problem exists in printers and ask handlers, with worse
consequences).

"Namespaces are one honking great idea -- let's do more of those!"
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Joachim Durchholz
2012-02-16 08:19:16 UTC
Permalink
Post by Kevin Hunter
*In [11]: %timeit a.is_Relational*
*10000000 loops, best of 3: 76.3 ns per loop*
*
*
*In [12]: %timeit isinstance(a, Relational)*
*1000000 loops, best of 3: 289 ns per loop*
Microbenchmark results are, in themselves, meaningless. For example,
measuring something that takes up negligible amounts of time in any real
application isn't worth optimizing and hence not measuring.

Sometimes they are even wrong. You might be measuring loop setup time,
and if the interpreter is doing different things depending on whether
you're using is_Class or isinstance, you'll get bogus results that won't
replicate in real-life applications. (For example, the interpreter might
optimize one kind of loop but not the other. Or it might grok that the
loop has no effect and optimize it out, and what you're measuring is how
long it took the interpreter to realize it. Or a gazillion of other
things - unless you know *exactly* what the interpreter is doing, you'll
be limited to guesswork, which means you can't rely on any conclusions.)

In this case, you're comparing apples and oranges. is_Relational is
checking something different than isinstance (and Ronan says
is_Relational is checking the wrong thing).
In other words, the difference would be meaningless even if the
measurement were accurate.

Just my 2c.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Kevin Hunter
2012-02-16 00:05:58 UTC
Permalink
Excellent! For my simple test case, this now no longer goes over the 1000
recursion limit.

As an FYI, with your change, my models using Sympy now only use 45 levels
of recursion. (As tested by the "if it broke test" binary test.)

I really appreciate your work on this. Cheers!
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sympy/-/7LZgekkDJ7UJ.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Chris Smith
2012-02-13 06:45:31 UTC
Permalink
Post by Aaron Meurer
In this particular case, it's determining if an Add is positive by
peeling off one term at a time (see line 440 of add.py), which I think
should be changed (not only is this needlessly recursive, but it's
also inefficient, because it recomputes Add.flatten each time, making
it a O(n**2) operation).
If I'm not mistaken, it hasn't done this since about 4500 commits ago
:-) (SHA1 40853ad65801838c45b5). But it *is* recursive.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Aaron Meurer
2012-02-13 07:51:33 UTC
Permalink
Oh, so it's not that bad. Still, there's no reason why this has to be
implemented recursively. Not only will it avoid recursion error
problems, but it will be faster if it's written non-recursively, as
function calls in Python are somewhat expensive.

For example, look at the speedup I had rewriting a function deep in
the polys to be non-recursive a while back by reading the commit
message of 4cb7aa93601c16d26767858175d7b1b4c04e9fca.

Aaron Meurer
Post by Chris Smith
Post by Aaron Meurer
In this particular case, it's determining if an Add is positive by
peeling off one term at a time (see line 440 of add.py), which I think
should be changed (not only is this needlessly recursive, but it's
also inefficient, because it recomputes Add.flatten each time, making
it a O(n**2) operation).
If I'm not mistaken, it hasn't done this since about 4500 commits ago
:-) (SHA1 40853ad65801838c45b5). But it *is* recursive.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
--
You received this message because you are subscribed to the Google Groups "sympy" group.
To post to this group, send email to sympy-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to sympy+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
Continue reading on narkive:
Loading...