[Bug?] gremlinpython hangs or does not recover connections after a connection error occurs
Hello, TinkerPop team.
I am struggling to avoid problems after a connection error occurs.
I now suspect they might be caused by a bug in gremlinpython.
Are these bugs, or am I just using it incorrectly?
Please let me know.
Best regards,
Environment
- WSL2 on Windows 11 (Ubuntu)
- Python 3.12.4
- gremlinpython 3.7.2
- TinkerPop server: JanusGraph 1.0.0
Case 1: Script hangs once all pooled connections are consumed?
When I specify a wrong URL to simulate a network error,
gremlinpython seems to consume connections without returning them to the pool.
As a result, the script below hangs after all pooled connections are consumed.
Python script: see case1.py
Output: see case1-output.txt
The result changes when I pass a different value for the pool_size argument.
My expectation is that the error message is shown 9 times and the script ends.
Case 2: Manual transaction is never rolled back (closed)
As in case 1, the manual transaction never ends,
so I cannot recover from the error.
Python script: see case2.py
Output: see case2-output.txt
My expectation is that the script ends after 9 attempts, all of which fail.
Case 3: Once a connection error has occurred, pooled connections stay broken
After I temporarily stopped the TinkerPop server (JanusGraph),
some pooled connections are broken and are not recovered.
Python script: see case3.py
Output: see case3-output.txt
My expectation is that connections are refreshed when they turn out to be unavailable as they are taken from the pool.
case1.py (737 B)
case1-output.txt (502 B)
case2.py (996 B)
case2-output.txt (333 B)
case3.py (769 B)
case3-output.txt (2.32 KB)
Solution
What you're noticing here boils down to how connection pooling works in gremlin-python. The pool is really just a queue that a connection adds itself back to after either an error or a success, but it is missing handling for the scenarios you pointed out. One of the main issues is that the pool itself can't determine whether a connection is healthy, or unhealthy and in need of removal from the pool.
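To make the queue behaviour a bit more concrete, here is a toy model in plain Python. It is not gremlin-python's actual code: `ToyConnection` and `run_requests` are made-up names, and the "forgets to return itself on error" path is an assumption chosen to mimic the symptom in case 1. The only difference from a real hang is that the toy uses a timeout on `get()` so it reports exhaustion instead of blocking forever.

```python
import queue


class ToyConnection:
    """Stand-in for a pooled connection (hypothetical, not the driver's class)."""

    def __init__(self, pool, healthy=True):
        self._pool = pool
        self.healthy = healthy

    def send(self, message):
        if not self.healthy:
            # Modelled failure path: the connection is NOT put back,
            # so the pool shrinks by one on every failed request.
            raise ConnectionError("cannot reach server")
        # Success path: the connection returns itself to the queue.
        self._pool.put_nowait(self)
        return f"ok: {message}"


def run_requests(pool_size, attempts, timeout=0.1):
    """Send `attempts` requests through a pool of unreachable connections."""
    pool = queue.Queue()
    for _ in range(pool_size):
        pool.put_nowait(ToyConnection(pool, healthy=False))

    outcomes = []
    for i in range(attempts):
        try:
            # Without a timeout this call would block forever once the
            # queue is empty, which looks like a hung script.
            conn = pool.get(timeout=timeout)
        except queue.Empty:
            outcomes.append("pool exhausted")
            break
        try:
            outcomes.append(conn.send(i))
        except ConnectionError as exc:
            outcomes.append(f"error: {exc}")
    return outcomes


if __name__ == "__main__":
    for line in run_requests(pool_size=4, attempts=9):
        print(line)
```

In this model the number of error messages you see before the stall equals the pool size, which would be consistent with the output changing when you vary `pool_size`.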
I think you should go ahead and open a Jira issue for this. If it's easier for you, I can help you create one that references this post. I think the only workaround right now is to occasionally open a new Client, which creates a new pool of connections, when you notice some of those exceptions.
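A rough sketch of that workaround is below. `RecreatingClient`, `client_factory` and `max_attempts` are names I made up for illustration, and the URL in the usage comment is a placeholder. It only helps when the failure surfaces as an exception. If a call blocks instead, as in case 1, this wrapper will block too. Catching `Exception` is deliberately broad here, so you would probably want to narrow it so that ordinary query errors are not retried.

```python
class RecreatingClient:
    """Wraps a client and rebuilds it (and its pool) after a failed request.

    `client_factory` is any zero-argument callable returning an object with
    `submit(query)` and `close()`, for example a gremlin-python Client.
    """

    def __init__(self, client_factory, max_attempts=3):
        self._factory = client_factory
        self._max_attempts = max_attempts
        self._client = client_factory()

    def submit(self, query):
        last_error = None
        for _ in range(self._max_attempts):
            try:
                # ResultSet.all() returns a future; result() waits for it.
                return self._client.submit(query).all().result()
            except Exception as exc:  # assumption: narrow this in real code
                last_error = exc
                self._reset()
        raise last_error

    def _reset(self):
        # Discard the old client and its possibly broken pool.
        try:
            self._client.close()
        except Exception:
            pass  # the old client may already be unusable
        self._client = self._factory()

    def close(self):
        self._client.close()


# Possible usage with gremlin-python (URL and pool size are placeholders):
#
#   from gremlin_python.driver import client
#   safe = RecreatingClient(
#       lambda: client.Client("ws://localhost:8182/gremlin", "g", pool_size=4)
#   )
#   print(safe.submit("g.V().count()"))
#   safe.close()
```

Taking a factory rather than a ready-made client keeps the wrapper independent of how the Client is configured, and it means a fresh pool is created with the same settings every time.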