- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 33.5k
test: fix test-tls-ticket-cluster.js #52431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: fix test-tls-ticket-cluster.js #52431
Conversation
Ensured connection end is initiated by worker (server) Fixes: nodejs#2510
| c.on('end', () => { | ||
| c.end(); | ||
| }); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| c.on('end', () => { | |
| c.end(); | |
| }); | 
Removing the whole 'secureConnect' listener should have the same effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your review. I applied your suggestion and tested it locally. I ran it over 4000 times and got 2 timeouts. The following is the test log:
E:\github\node>python tools\test.py test\parallel\test-tls-ticket-cluster
=== release test-tls-ticket-cluster ===
Path: parallel/test-tls-ticket-cluster
spawn file: E:\github\node\out\Release\node.exe
spawn file: E:\github\node\out\Release\node.exe
spawn file: E:\github\node\out\Release\node.exe
spawn file: E:\github\node\out\Release\node.exe
[primary] got "listening"
[primary] connecting 49193 session? false
[primary] got "listening"
[primary] got "listening"
[primary] got "listening"
[worker] connection reused? false
[primary] got "not-reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[primary] connecting 49193 session? true
[worker] connection reused? true
[primary] got "reused"
[worker] got "die"
[worker] server close
[worker] exit
[worker] got "die"
[worker] server close
[worker] exit
[worker] got "die"
[worker] server close
[worker] exit
[primary] worker died
[worker] got "die"
[worker] server close
[worker] exit
[primary] worker died
[primary] worker died
[primary] worker died
Command: E:\github\node\out\Release\node.exe E:\github\node\test\parallel\test-tls-ticket-cluster.js
--- TIMEOUT ---
[02:32|% 100|+   0|-   1]: Done
Failed tests:
E:\github\node\out\Release\node.exe E:\github\node\test\parallel\test-tls-ticket-cluster.js
Since the initial solution doesn't have that issue, I'd go with it to avoid timeouts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can reproduce the timeouts also with the initial version then as the only difference is that c.end() is called in the next tick. See 
node/lib/internal/streams/readable.js
Line 1699 in 9ef03f1
| process.nextTick(endWritableNT, stream); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine either way, as is or with the 'secureListener' event listener removed. The result is the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left the test running overnight and saw the error, which is related to the handshaking. It appears that removing secureConnect doesn't have the same effect as adding an end callback. Do you have any other suggestion I could try here?
P.S. I applied your other suggestion, the code looks better now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left the test running overnight and saw the error, which is related to the handshaking. It appears that removing
secureConnectdoesn't have the same effect as adding anendcallback. Do you have any other suggestion I could try here?
I have no suggestions. What is currently done in the 'end' callback, is already done automatically even if the callback is not added. The only difference is that c.end() is called in the next tick. This makes it explicit and I'm fine with it.
Co-authored-by: Luigi Pinca <luigipinca@gmail.com>
|  | 
| Landed in cfca9e3 | 
Problem
I've investigated the related issue in depth and found out that this test sometimes fails because of the handshake process. When the error happens and I look at the Wireshark logs, I see that the client doesn't send a
Change Cipher Specpacket, but instead sends aFINpacket after a while. TheFINpacket is sent in approximately 10ms. This may be some timeout, but I wasn't able to find proof for that.I've added 2 screenshots from the Wireshark logs (which I can provide too if requested). The first one shows the expected packet flow when the test passes, and the second one shows the packet flow when the test fails.
Pass:
Failure:
As can be seen, the
Change Cipher Specpacket was missing, and theFINpacket had already been sent before the handshake was complete.I've seen discussions about the issues in the handshake process of Node.js in the following links: openssl/openssl#8069, openssl/openssl#7199, #18770.
As I don't have in-depth knowledge about OpenSSL, I'm not sure if this issue is caused by the way OpenSSL is used by Node. If there is a need to investigate it, a new issue can be opened.
Fix
In this PR, I’ve made the
Primary(client) wait for theFINpacket from theWorker(server) to close the connection. This approach ensures that thePrimarywaits until it receives aFINpacket from theWorkerand doesn’t time out during the handshake. I've run the test with the fix roughly 11k times and it hasn't failed once. Without the fix, I would usually get a failure with 1k runs easily.Fixes: #2510